Description

[0001] This invention relates to the field of biology, more particularly to the field of recombinant DNA technology. The invention relates to novel recombinant recombinase proteins and their use to catalyse DNA site-specific recombination. The invention also relates to methods for site-specific recombination in vitro or in vivo, as well as to DNA cassettes and vectors which can be employed in such methods. The proteins, vectors and methods of the present invention can be used to stably integrate a DNA sequence of interest into a recipient DNA.

[0002] Manipulations of gene expression involve the introduction of transfected genes (transgenes) to confer some novel property upon, or to alter some intrinsic property of, cells or organisms. The efficacy of such manipulations is often impaired by such problems as the inability to control the chromosomal site of transgene integration, or the inability to control the number of copies of a transgene that integrate at the desired chromosomal site, or by the difficulties in controlling the level, temporal characteristics, or tissue distribution of transgene expression, or by the difficulty of modifying the structure of transgenes once they are integrated into mammalian chromosomes. Thus the basic problems are the inability to insert and retrieve any DNA segment with precision, good yields, efficiency and fidelity.

[0003] Site-specific recombination systems constitute means to circumvent these problems. Mainly two systems are routinely used to perform site-specific recombination in DNA; there may be mentioned the Cre/Lox system (Sauer, 199[...]) [...], which catalyse the recombination reaction between two short recognition DNA sequences. It has been shown that these site-specific recombination systems, although of microbial origin for the majority, could function in higher eukaryotes such as plants, insects and mice (Sauer, 1994; Rajewsky et al., 1996; Sauer, 1998).

[0004] Additional site-specific recombination systems based on transposases have been developed, such as the site-[...] (Kim et al., 2000), the integrons recombination system (Ploy et al., 2000).

[0005] Besides the prior art recombination systems, there remains a need for an additional improved site-specific recombination system to allow the targeting of additional sites. Indeed, existing site-specific recombination systems are limited to target sequences which contain the recognition site of the particular recombination system. In the system described in the invention, the recognition site can be modified or exchanged with new targeting sites. Such additional site-specific recombination systems would allow the simultaneous manipulation of transgenes in a host cell. The problem to be solved is to develop new site-specific recombinases with new targeting site(s).

[0006] The generally employed site-specific recombination techniques utilize the DNA-binding characteristics of the given recombinase used. In accordance with the present invention, the inventors propose to add new DNA-binding specificity to transposases and recombinases by adding/replacing their DNA-binding domains with that of heterologous proteins. An advantage of the system of the invention is that the multiplicity of recombinase and transposase genes available in nature allows the creation of a multiplicity of chimeric recombinant site-specific recombinases that catalyse recombination at different site-directed recombinase targeting sequences (SDRTS). Moreover, another advantage of the present invention-based recombination technology is that, unlike with the "cut and paste" type transposons, there is no size restriction on the DNA used as donor, allowing large DNA constructs (large plasmids, cosmids, PACs) to be integrated.

[0007] The invention provides a fusion protein comprising at least (i) a site-directed recombinase protein, or a fragment or a variant thereof, and (ii) a heterologous protein, or a fragment or a variant thereof, that binds, either directly or indirectly, to DNA at least at one responsive element (RE), said fusion protein having a heterologous site-directed recombinase activity such that, in a cell or a cell-free system containing said responsive element, said fusion protein binds directly or indirectly to said responsive element and catalyzes recombination in the vicinity of or at said DNA responsive element in the presence of a site-directed recombinase targeting sequence (SDRTS). In a preferred embodiment, only a fragment corresponding to a functional DNA-binding domain of the heterologous protein is fused to the site-directed recombinase protein, or a fragment or a variant thereof. In a second preferred embodiment, only a fragment corresponding to the functional site-directed recombinase domain of said site-directed recombinase is fused to the heterologous protein, or a fragment or a variant thereof.

[0008] "Heterologous protein" refers to a protein which is preferably different from the protein used to bring the site-specific recombinase activity to the fusion protein of the invention.

[0009] Preferably, the endogenous DNA-binding domain of said site-directed recombinase protein, or a fragment or a variant thereof, is no longer functional in the fusion protein, such that the fusion protein of the invention no longer catalyzes recombination at the natural endogenous targeting sequence of said site-directed recombinase protein. The DNA-binding domain of said endogenous site-directed recombinase has been rendered non-functional either by partial or total deletion of said endogenous DNA-binding domain, or by mutation of said endogenous DNA-binding domain, or by the fusion to the heterologous DNA-binding domain.

[0010] Alternatively, it could be of interest that the fusion protein of the invention retains totally or partially its ability to bind DNA at its endogenous DNA-binding domain and to catalyse recombination both at the natural endogenous targeting sequence of said site-directed recombinase protein and at the heterologous targeting sequence.

[0011] The expression "recombinase fragment" is understood to mean any recombinase portion exhibiting at least a recombinase activity. The expression "heterologous protein fragment" means any portion of the heterologous protein exhibiting at least a site-specific DNA-binding activity, either directly or indirectly.

[0012] A heterologous protein, or a fragment or a variant thereof, that binds directly to DNA means that this protein is directly in contact with the double helix by recognizing a DNA sequence motif or a DNA conformation. The binding between the protein and the DNA sequence is performed by means of a series of weak bonds between the protein amino acids and the DNA bases. Such a protein comprises conserved consensus regions easily recognizable by a person skilled in the art. Among those consensus regions, one can cite for example the zinc-finger motif, the leucine zipper motif, and the helix-turn-helix motif.

[0013] A heterologous protein, or a fragment or a variant thereof, that binds indirectly to DNA means that this protein binds to DNA through at least one additional protein. This additional protein comprises a DNA-binding domain and a protein-binding domain to interact with the heterologous protein of the invention.

[0014] The recombinase activity of the fusion protein of the invention, or fragments thereof, can be evaluated by the estimation of the frequency of recombination events catalyzed by said recombinase. This is estimated by techniques known to persons skilled in the art. These events may be revealed by PCR or Southern blotting, the recombination frequency being estimated by taking the ratio of the representation of the various alleles in the cells of a tissue. The frequencies of the various alleles may be estimated by assaying the intensity of the corresponding bands on an electrophoresis gel of a product of PCR amplification or of genomic DNA (Southern blotting). The use of PCR makes this method of estimation extremely sensitive and makes it possible to detect the presence of cells whose genome has not undergone targeted site-specific recombination. Another way of estimating the efficiency of the recombination may be carried out indirectly by immunohistochemistry, by analyzing for example the expression of the gene sequence that has been inactivated by the site-specific recombination event.
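
By way of illustration only, the ratio-based estimate described above can be sketched as follows. The function name, the two-band model (one band for the recombined allele, one for the unrecombined allele) and the assumption that band intensity is proportional to allele abundance are illustrative choices and are not taken from the present description.

```python
def recombination_frequency(recombined_intensity: float,
                            unrecombined_intensity: float) -> float:
    """Estimate the fraction of alleles that have undergone recombination.

    Assumption (not from the description): each allele gives one band on
    the gel and band intensity scales linearly with allele abundance.
    """
    if recombined_intensity < 0 or unrecombined_intensity < 0:
        raise ValueError("band intensities must be non-negative")
    total = recombined_intensity + unrecombined_intensity
    if total == 0:
        raise ValueError("at least one band must have a measurable intensity")
    # Ratio of the recombined allele to all alleles detected.
    return recombined_intensity / total


if __name__ == "__main__":
    # Hypothetical densitometry readings in arbitrary units.
    print(recombination_frequency(30.0, 90.0))
```

With more than two alleles the same ratio would be taken over the sum of all band intensities; that extension is left out of this sketch.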

[0015] By "fragments of the fusion protein of the invention", it is meant any biologically active part of the fusion protein able to catalyze at least site-directed DNA recombination at the heterologous targeting sequence.

[0016] The site-specific DNA-binding activity of said heterologous protein, or fragment thereof, can be evaluated by techniques commonly used by the person skilled in the art, such as, for example, footprinting assay, gel shift assay, [...].

[0017] The site-directed recombinase part of the fusion protein of the invention is selected among procaryotic site-directed recombinase and eukaryotic site-directed recombinase. More preferably, said site-directed recombinase is selected in the group comprising transposase and DNA recombinase.

[0018] By "transposase", it is meant a protein which serves to insert transposable elements into the genome of the recipient organism. The transposase protein recognizes specific DNA sequences (e.g. inverted repeat ends, SDRTS) and can catalyse the recombination reaction by joining these specific SDRTS sequences to target DNA. Target DNA can be a random DNA sequence or a more or less specific target site (e.g. transposition hot spot). Transposases are members of a class of enzymes [...] located on the same molecule as the transposase gene(s) itself.

[0019] By "recombinase", it is meant a protein which catalyses rearrangement of DNA sequences. There can be recombinases that catalyse homologous recombination (e.g. RecABC system of E. coli) or site-[...] (e.g. the att/int system of lambda bacteriophage; Tn3 resolvase family of bacteria; Cre recombinase of P1 bacteriophage; FLP recombinase of Saccharomyces; V(D)J recombinase RAGs of the immune system, etc.).

[0020] Any transposase is in the scope of the present invention. In a preferred embodiment, said transposase is selected from the transposases encoded by sequences derived from a transposable element selected from the group consisting of:

- procaryotic mobile elements [...] IS10L, IS10UR-2, IS26, IS30, IS50, IS150, IS186, IS911 and [...];

- eukaryotic mobile elements Tc1 of Caenorhabditis elegans, the P and Copia elements of Drosophila, the synthetic element Sleeping Beauty, Hobo, Mariner, Ac-Ds, Spm, the Mu elements of maize, Ty1, Ty2, Ty3 elements of yeast, Tam1, Tam2 and Tam3 elements of Antirrhinum, Tx1 and Tx2 elements of Xenopus; and

- retrotransposons such as blood, gypsy, springer and beagle elements of Drosophila; and

- integrases of retroviruses RSV, SNV, Mo-MLV, MMTV, HTLV1 and HIV1;

and their fragments or variants thereof.

[0021] More preferably, the transposase is selected from the transposases able to form covalently joined SDRTS sequences, such as sequences derived from procaryotic mobile elements IS1, IS2, IS3, IS5, IS10L, IS10UR-2, IS26 [...].

[0022] Alternatively, the recombinase is a transposase selected from the transposases encoded by sequences able to form covalently joined SDRTS sequences, such as eukaryotic mobile elements Tc1 of Caenorhabditis elegans, the P and Copia elements of Drosophila, Ac-Ds, Spm and Mu elements of maize, Ty1, Ty2, Ty3 elements of yeast, Tam1, Tam2 and Tam3 elements of Antirrhinum, Tx1 and Tx2 elements of Xenopus.

[0023] In another embodiment, the recombinase is a DNA recombinase selected from the group of site-directed recombinases comprising the Cre recombinase of bacteriophage P1, the FLP recombinase of Saccharomyces cerevisiae, the R recombinase of Zygosaccharomyces rouxii pSR1, the A recombinase of Kluyveromyces drosophilarum [...], the [...] system of the Mu bacteriophage, the recombinase of the CIN recombination system of P1 bacteriophage, the recombinase of the MIN recombination system of P1-15 bacteriophage, the bacterial β recombinase, the invertase of the Hin system [...].

[0024] The heterologous protein, or a fragment or a variant thereof, comprises at least a DNA-binding domain, said heterologous protein being selected among site-specific recombinases, transposases, [...] procaryotic activator proteins, procaryotic repressor proteins, eukaryotic transcription factors, [...] nucleases, chromatin proteins, or a fragment or a variant thereof. In a preferred embodiment, the heterologous protein is the procaryotic [...]. In another preferred embodiment, said heterologous protein is the eukaryotic zinc-finger transcription factor Gli [...] thereof. In another preferred embodiment, said heterologous protein is the Tet repressor, encoded by the tetracycline resistance gene, or fragments or variants thereof. In another preferred embodiment, said heterologous protein, or a fragment or a variant thereof, is a factor that binds DNA indirectly and is selected among proteins that are known to associate with DNA-binding factors, such as co-activators or co-repressors of transcription factors:

- factors such as TATA binding protein (TBP) associated factors (TAFIIs) (Bell et al., 1999);

- factors associated with the RNA polymerase II Mediator complex, such as TRAPs (also called DRIPs or MEDs) (Malik et al., 2000);

- histone deacetylases (HDAC1, HDAC2, HDAC3 and HDAC8) (Exp. Cell. Res., 2001);

- histone acetyltransferases (GCN5, PCAF, P300, CBP, TIF2, SRC-1) (Marmorstein et al., 2001);

- histone methyltransferases (Jenuwein, 2001);

- transcription factors such as smad2 that do not bind DNA directly but in conjunction with other DNA-binding factors of the TGF beta signal transduction pathway.

[0025] It is also in the scope of the invention to use variants of the site-specific recombinase and the heterologous protein. The expression "variant" is understood to mean all the wild-type recombinases or heterologous proteins, or fragments thereof, which may exist naturally and which correspond in particular to truncations, [...] and/or additions of amino acid residues. These recombinases, heterologous proteins and fragments thereof are preferably derived from the genetic polymorphism in the population. The expression "variant" is also understood to mean the synthetic variants for which the above modifications are not naturally present, but were introduced artificially, by genetic engineering, or were chemically induced for example.

[0026] The DNA responsive element is preferably selected among operator regions of procaryotic repressors, methylation sites of sequence-specific methylases, recognition sites of restriction endonucleases, binding sites of host factors, recognition sequences of site-specific recombinases, hot spot sequences for transposases, transcription factor responsive elements (e.g. GC box of SP1 transcription factor, CAAT box, octamer responsive element...), the phage λ operator region, the CpG island, LHS, GOHS, IS30 recognition sequence, (IS30)2. The heterologous protein comprising a DNA-binding domain that binds to the DNA responsive element is chosen accordingly. In a preferred embodiment, said DNA responsive element is the operator region of the phage λ cI repressor, and the heterologous protein is the procaryotic repressor protein cI of the lambda phage, and the variants thereof.

[0027] In one embodiment, the fusion protein of the invention comprises the cI repressor of the lambda phage or a fragment or a variant thereof (as a heterologous protein) and the IS30 transposase from Escherichia coli or a fragment or a variant thereof (as a transposase). In a more preferred embodiment, the fusion protein of the invention is of sequence [...].

[0028] In a second embodiment, the fusion protein of the invention comprises the DNA-binding domain of the Gli transcription factor or a fragment or a variant thereof (as a heterologous protein) and the IS30 transposase from Escherichia coli or a fragment or a variant thereof (as a transposase). In a more preferred embodiment, the fusion protein of the invention is of sequence SEQ ID N°4.

[0029] In a third embodiment, the fusion protein of the invention comprises the Tet repressor or a fragment or a variant thereof (as a heterologous protein) and the IS30 transposase from Escherichia coli or a fragment or a variant thereof (as a transposase). In a more preferred embodiment, the fusion protein of the invention is of sequence SEQ ID N°6.

[0030] In a preferred embodiment, the fusion protein of the invention comprises the IS30 transposase from Escherichia coli or a fragment or a variant thereof. In a more preferred embodiment, the fusion protein of the invention comprises the first 39 amino acids of the IS30 transposase.

[0031] By "fusion protein catalyzes recombination in the vicinity or at said DNA responsive element", it is meant that [...] is integrated in the DNA responsive element or at a distance smaller than 10 [...] base pairs (bp), 1000 bp, 800 bp, 750 bp, 500 bp, 400 bp, 300 bp, 250 bp, 200 bp, 150 bp, 100 bp, 75 bp, 50 bp, 40 bp, 35 bp, 30 bp, 25 bp, 20 bp, 15 bp, 10 bp, or smaller than 5 bp.

[0032] The fusion protein of the invention can be produced by chemical synthesis or by a [...]. Preferably, the fusion protein of the invention is biologically produced by means of recombinant DNA technologies [...] (Escherichia coli, a yeast, a protozoan), a plant or a cell derived from a multicellular organism such as a fungus, an insect cell, a plant cell, a mammalian cell or a cell line, which carries an expression system as defined below.

[0033] The polypeptide produced as described above may be subjected to post-translational or post-synthetic modifications as a result of thermal treatment, chemical treatment (formaldehyde, glutaraldehyde [...] proteinases and protein modification enzymes). As an example, glycosylation is often achieved when the [...] with amino acid residues Asn, Ser, Thr or hydroxylysine. Such modifications of the protein fusion of the invention may be useful for example to enhance the half-life of this polypeptide in an organism or in cells, to enhance the solubility of this polypeptide, or to facilitate the purification of the protein fusion of the invention (e.g. [...]).

[0034] As used herein, by "protein", "peptide" or "polypeptide" is meant any chain of amino acids, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation).

[0035] The present invention also relates to the recombinant polynucleotide encoding a protein fusion of the invention. In one embodiment, the invention provides recombinant polynucleotides of sequence SEQ ID N°1, SEQ ID N°3 and SEQ ID N°5 encoding respectively the fusion proteins of sequence SEQ ID N°2, SEQ ID N°4 and SEQ ID N°6.

[0036] Additionally, the invention provides a recombinant polynucleotide comprising a polynucleotide selected among:

a) the polynucleotides of the invention;

b) the polynucleotides presenting at least 70% identity after optimal alignment with the polynucleotide of step a);

c) fragments of polynucleotides of the invention encoding a peptide having at least a site-directed recombinase activity;

d) the complementary sequence or RNA sequence corresponding to a polynucleotide of step a), b) or c).

[0037] As used herein, "percentage of identity" between two nucleic acid sequences or two amino acid sequences means the percentage of identical nucleotides, respectively amino acids, between the two sequences to be compared, obtained with the best alignment, [...] these two sequences being randomly spread over the nucleic acid or amino acid sequences. As used herein, "best alignment" or "optimal alignment" means the alignment for which the determined percentage of identity (see below) is the highest. Sequence comparisons between two nucleic acid or amino acid sequences are usually realised by comparing these sequences after they have been aligned according to the best alignment; this comparison is realised on segments of comparison in order to identify and compare the local regions of similarity. The best sequence alignment to perform comparison can be realised, besides manually, by using the local homology algorithm developed by Smith and Waterman (1981), by using the homology alignment algorithm developed by Needleman and Wunsch (1970), by using the method of similarities developed by Pearson and Lipman (1988), or by using computer software implementing such algorithms (GAP, BESTFIT, BLAST P, BLAST N, FASTA, TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, [...]). [...] the BLAST software is used, with the BLOSUM 62 matrix, or the PAM or PAM 250 matrix. The identity percentage between two sequences of nucleic acids or amino acids is determined by comparing these two sequences optimally aligned, the nucleic acid or the amino acid sequences being able to comprise additions or deletions with respect to the reference sequence in order to get the optimal alignment between these two sequences. The percentage of identity is calculated by determining the number of identical positions between these two sequences, dividing this number by the total number of compared positions, and multiplying the result obtained by 100 to get the percentage of identity between these two sequences.

[0038] As used herein, nucleic acid sequences having a percentage of identity of at least 70%, preferably at least 72%, 75%, 77%, 80%, 82%, 85%, 87%, 90%, 92%, 95%, 97%, 98%, 99%, 99.5% after optimal alignment, means nucleic acid sequences having, with regard to the reference sequence, modifications such as [...] insertions, chimeric fusions, and/or substitutions, especially point mutations, the nucleic sequence of which presents at least 70%, preferably at least 72%, 75%, 77%, 80%, 82%, 85%, 87%, 90%, 92%, 95%, 97%, 98%, 99%, 99.5% identity after optimal alignment with the nucleic acid sequence of reference.
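
As a minimal illustrative sketch, the percentage-of-identity calculation defined above can be written as follows for two sequences that have already been optimally aligned. The function names, the use of "-" as the gap symbol, and the choice to count every alignment column (including gapped columns) as a compared position are assumptions made for this example; the alignment step itself (Smith and Waterman, Needleman and Wunsch, BLAST, etc.) is not reproduced here.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumptions (not from the description): both strings come from the
    same alignment, gaps are written as `gap`, and every alignment column
    counts as a compared position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column is identical when both residues match and neither is a gap.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True when the percentage of identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-column alignment with one gap and one mismatch.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
    print(meets_identity_threshold("ACGT-ACGTA", "ACGTTACGAA"))
```

Excluding gapped columns from the denominator is an equally common convention and would give a different value for the same alignment; the description does not settle this point.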

[0039] The invention also furnishes a DNA cassette comprising a promoter operably linked to a [...] of the invention. A DNA cassette is a nucleic acid construct generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a target cell. The DNA cassette can be part of a vector or a nucleic acid fragment. Among these specified nucleic acid elements, one can cite the "promoter" sequence, which is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence. [...] initiation site, as well as protein binding domains responsible for the binding of RNA polymerase. Eukaryotic promoters will often, but not always, contain TATA boxes and CAT boxes. Various promoters, including ubiquitous or tissue-specific promoters, and inducible and constitutive promoters, may be used to drive the expression of the protein fusion gene of the invention. More preferably, the promoter is inducible by an inducing stimulus to cause the transcription of said fusion protein encoding gene or said polynucleotide.

[0040] "Operatively linked", as used herein, includes reference to a functional linkage between a promoter and a second sequence (i.e. a polynucleotide of the invention), wherein the promoter sequence initiates and mediates the transcription of said DNA sequence corresponding to the second sequence. Generally, operatively linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame. According to the present invention, the nucleic acid sequences encoding the site-specific recombinase protein (or a fragment or a variant thereof) and the heterologous protein (or a fragment or a variant thereof) are operatively linked to generate a polynucleotide of the invention.

[0041] It is also a goal of the invention to furnish a vector comprising a polynucleotide of the invention (i.e. a cloning vector) or an expression vector comprising at least one DNA cassette of the invention. A "vector" is a replicon to which another polynucleotide segment is attached, so as to bring about the replication and/or expression of the attached segment. Examples of vectors include plasmids, phages, cosmids, phagemids, yeast artificial chromosomes (YAC), bacterial artificial chromosomes (BAC), phage P1 artificial chromosomes (PAC), human artificial chromosomes (HAC), viral vectors such as adenoviral vectors, retroviral vectors and adeno-associated viral vectors, and other DNA sequences which are able to replicate or to be replicated in vitro or in a host cell, or to convey a desired DNA segment to a desired location within a host cell. A vector according to the invention is preferably derived from a viral vector, more preferably an adenoviral, retroviral, or adeno-associated viral vector. The size of the vector is not critical. However, it is preferably between about 5 and about 50 kilobases and more preferably between about 8 and about 20 kilobases. A vector can have one or more restriction endonuclease recognition sites at which the DNA sequences can be cut in a determinable fashion without loss of an essential biological function of the vector, and into which a DNA fragment can be spliced in order to bring about its replication and cloning. Vectors can further provide primer sites (e.g. for PCR), transcriptional and/or translational initiation and/or regulation sites, recombinational signals, replicons, selectable markers, etc. The vector can further contain a selectable marker suitable for use in the identification of cells transformed with the vector. The vector is either a circular or linear molecule.

[0042] The fusion gene encoding the fusion protein of the invention present in a DNA cassette or in an expression vector can be expressed using any suitable expression sequences. Numerous expression sequences are known and can be used for expression of the fusion gene. Expression sequences can generally be classified as promoters, terminators and, for use in eukaryotic cells, enhancers. Expression in prokaryotic cells also requires a Shine-Dalgarno sequence just upstream of the coding region for proper translation initiation. Promoters suitable for use with prokaryotic hosts illustratively include the β-lactamase and lactose promoter systems, tetracycline (tet) promoter, alkaline phosphatase promoter, the tryptophan (trp) promoter system and hybrid promoters such as the tac promoter. However, other functional bacterial promoters are suitable. Their nucleotide sequences are generally known [...]. [...] sequences for use with yeast hosts include, for example, the promoters for 3-phosphoglycerate kinase, enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase. Examples of inducible yeast promoters suitable for use in the vectors of the invention include, for example, the promoter regions for alcohol dehydrogenase 2, [...] acid phosphatase. Yeast enhancers also are advantageously used with yeast promoters. Preferred promoters for use in mammalian host cells include promoters from polyoma virus, Simian Virus 40 (SV40), adenovirus, retroviruses, hepatitis B virus, herpes simplex virus (HSV), Rous sarcoma virus (RSV), mouse mammary tumor virus and, most preferably, cytomegalovirus (CMV), or from heterologous mammalian promoters such as the β-actin promoter. Transcription of the fusion protein gene or the polynucleotide of the invention by higher eukaryotes can be increased by inserting an enhancer sequence into the vector. Many enhancer sequences are now known from mammalian genes (globin, elastase, albumin, and insulin) or from eukaryotic cell viruses (SV40, CMV). The disclosed vectors preferably also contain sequences necessary for accurate 3' end termination; in eukaryotic cells, this would be a polyadenylation signal. In prokaryotic cells, this would be a transcription terminator.

[0043] The recombinant DNA technologies used for the construction of the vectors according to the invention are well known and commonly used by persons skilled in the art. Standard techniques are used for [...] DNA amplification and purification; the enzymatic reactions involving DNA ligase, DNA polymerase and restriction endonucleases are carried out according to the manufacturer's recommendations. These techniques and others are generally carried out according to Sambrook et al. (1989).

[0044] The polynucleotide (in DNA or RNA form), the vector or fragments thereof may be introduced into the host cell by standard methods as described below. According to another embodiment of the invention, the fusion protein can be directly introduced into the organism, or into a cell of the organism. This introduction can be carried out by injection into a tissue or an organ in the case of an organism, or by microinjection in the case of a cell.

[0045] The disclosed vectors can be used to transiently transfect or transform host cells, or can be integrated into the host cell chromosome. Preferably, however, the vectors can include sequences that allow replication of the vector and stable or semi-stable maintenance of the vector in the host cell. Many such sequences for use in various cells (that is, eukaryotic and prokaryotic cells) are known and their use in vectors is routine. Generally, it is preferred that replication sequences known to function in host cells of interest be used.

[0046] More preferably, the invention provides a gene-targeting vector comprising at least:

a) A gene encoding a fusion protein or a polynucleotide of the invention;

b) Optionally, a promoter that is operatively linked to said fusion protein gene or to said polynucleotide;

c) Optionally, a DNA responsive element recognized by the DNA binding domain of said fusion protein;

d) At least two SDRTS recognized by the recombinase domain of said fusion protein;

e) At least one transposable DNA sequence of interest, wherein said DNA sequence of interest is located between said two SDRTS sequences;

f) Optionally, at least one marker gene;

and wherein one of said SDRTS is located between said fusion protein encoding gene or said polynucleotide and said DNA sequence of interest.
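
For illustration only, the positional constraints of the gene-targeting vector of paragraph [0046] can be expressed as a small consistency check on an annotated map. The `Feature` type, the feature labels and the use of linear, end-exclusive coordinates are hypothetical conventions introduced for this sketch; a circular vector would need wrap-around handling that is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feature:
    # Hypothetical labels: "fusion_gene", "sdrts", "sequence_of_interest",
    # "promoter", "responsive_element", "marker".
    kind: str
    start: int  # inclusive, on a linearised map (assumption)
    end: int    # exclusive


def check_layout(features: List[Feature]) -> List[str]:
    """Return the list of violated constraints (empty when none)."""
    problems: List[str] = []
    sdrts = [f for f in features if f.kind == "sdrts"]
    genes = [f for f in features if f.kind == "fusion_gene"]
    cargo = [f for f in features if f.kind == "sequence_of_interest"]

    if not genes:
        problems.append("item a): no fusion protein gene")
    if len(sdrts) < 2:
        problems.append("item d): fewer than two SDRTS")
    if not cargo:
        problems.append("item e): no DNA sequence of interest")

    for c in cargo:
        # Item e): one SDRTS on each side of the sequence of interest.
        left = any(s.end <= c.start for s in sdrts)
        right = any(s.start >= c.end for s in sdrts)
        if not (left and right):
            problems.append("item e): sequence of interest not between two SDRTS")
        # Final clause: an SDRTS lies between the fusion gene and the cargo.
        for g in genes:
            lo, hi = (g.end, c.start) if g.end <= c.start else (c.end, g.start)
            if not any(lo <= s.start and s.end <= hi for s in sdrts):
                problems.append("no SDRTS between fusion gene and sequence of interest")
    return problems


if __name__ == "__main__":
    vector = [
        Feature("fusion_gene", 0, 1000),
        Feature("sdrts", 1100, 1130),
        Feature("sequence_of_interest", 1200, 3000),
        Feature("sdrts", 3100, 3130),
    ]
    print(check_layout(vector))
```

The optional elements (promoter, responsive element, marker gene) are deliberately not checked, since the paragraph does not constrain their position.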
[0047] In another embodiment, the invention provides a gene-targeting vector comprising at least:

a) A gene encoding a fusion protein or a polynucleotide of the invention;

b) Optionally, a promoter that is operatively linked to said fusion protein gene or to said polynucleotide;

c) At least one DNA responsive element recognized by the DNA binding domain of said fusion protein;

d) At least two SDRTS recognized by the recombinase domain of said fusion protein, said SDRTS being joined covalently together and being separated at most by 1000 bp, 750 bp, 500 bp, 250 bp, 200 bp, 150 bp, 100 bp, 75 bp, 50 bp, 40 bp, 30 bp, 25 bp, 20 bp, 18 bp, 15 bp, 14 bp, 13 bp, 12 bp, 11 bp, 10 bp, 9 bp, 8 bp, 7 bp, 6 bp, 5 bp, 4 bp, 3 bp, 2 bp, 1 bp, and more preferably at most by 20 bp;

e) Optionally, at least one marker gene.

[0048] In another embodiment, the invention provides a gene-targeting vector comprising:

a) Optionally, a DNA responsive element recognized by the DNA binding domain of a fusion protein of the invention;

b) At least two SDRTS recognized by the recombinase domain of said fusion protein of step a);

c) At least one transposable DNA sequence of interest, wherein said DNA sequence of interest is located between said two SDRTS sequences;

d) Optionally, at least one marker gene.

[0049] As used herein, "marker gene" means a gene that encodes a protein or a peptide (i.e. a selectable marker)
that allows one to select for or against a molecule or a cell that contains it, often under particular conditions, i.e. in
presence of a selective agent. These marker proteins include but are not limited to products which provide resistance
against otherwise toxic compounds (e.g., antibiotics). For example, the ampicillin or the neomycin resistance genes
constitute genes encoding for selection markers of the invention. Those selection markers can be either positive or
negative (see Capecchi et al., US 5 631 153). According to a preferred embodiment, said selection marker protein is
a positive selection marker protein encoded by a gene selected in the group consisting of antibiotic resistance genes,
hisD gene, Hypoxanthine phosphoribosyl transferase (HPRT) gene, guanine-phosphoribosyl-transferase (Gpt) gene.
Said antibiotic resistance gene is selected in the group consisting of hygromycin resistance genes, neomycin resistance
genes, tetracyclin resistance genes, ampicillin resistance genes, kanamycin resistance genes, phleomycin resistance
genes, bleomycin resistance genes, geneticin resistance genes, carbenicillin resistance genes, chloramphenicol
resistance genes, puromycin resistance genes, blasticidin-S-deaminase genes. In a preferred embodiment, said
antibiotic resistance gene is a hygromycin resistance gene, preferably an Escherichia coli hygromycin-B phosphotransferase
(hpt) gene; in this case, the selective agent is hygromycin. In another preferred embodiment, said antibiotic resistance
gene is a neomycin resistance gene. Said neomycin resistance gene is chosen with respect to the cellular host in which
the fusion protein is expressed. For an expression restricted to prokaryotic cells, the neomycin resistance gene encoded
by the Tn10 transposon is preferred; in that case, the selective agent is kanamycin. More preferably, the Tn5 neomycin
resistance gene is used, such gene allowing selection to be performed both in eukaryotic and prokaryotic cells. In this
latter case, the selective agent is G418. In another embodiment, the positive selection marker protein of the invention
is HisD; in that case, the selective agent is Histidinol. In another embodiment, the positive selection marker protein of
the invention is Hypoxanthine phosphoribosyl transferase (HPRT) or Hypoxanthine guanosyl phosphoribosyl transferase
(HGPRT); in that case, the selective agent is Hypoxanthine. In another embodiment, the positive selection
marker protein of the invention is guanine-phosphoribosyl-transferase (Gpt); in that case, the selective agent is xanthine.
It is also in the scope of the invention to use negative selection marker proteins. For example, the gene encoding
for such proteins is the HSV-TK gene; in that case the selective agent is Acyclovir-Gancyclovir. For example, the
genes encoding for such proteins are the Hypoxanthine phosphoribosyl transferase (HPRT) gene or the guanine-
phosphoribosyl-transferase (Gpt) gene; in these cases, the selective agent is 6-Thioguanine. For example, the gene
encoding for such proteins is the cytosine deaminase gene; in that case the selective agent is 5-fluoro-cytosine. Other
examples of negative selection marker proteins are the viral and bacterial toxins such as the diphtheric toxin A (DTA). The
marker gene can also be a "reporter gene". By "reporter gene" is meant any gene which encodes a product whose
expression is detectable. A reporter gene product may have one of the following attributes, without restriction:
fluorescence (e.g., green fluorescent protein), enzymatic activity (e.g., lacZ or luciferase), or an ability to be specifically
bound by a second molecule (e.g., biotin or an antibody-recognizable epitope).
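
The pairing of marker genes with selective agents described in this paragraph can be summarised as a lookup table. The following is only an illustrative sketch: the dictionary keys and the function name selective_agent are hypothetical labels chosen for the example, and the pairs are limited to those named in the text.

```python
# Marker gene -> selective agent, restricted to the pairs named in the text.
# The key strings are illustrative labels, not official gene symbols.
POSITIVE_SELECTION = {
    "hpt": "hygromycin",
    "Tn10 neomycin resistance": "kanamycin",
    "Tn5 neomycin resistance": "G418",
    "HisD": "histidinol",
    "HPRT": "hypoxanthine",
    "Gpt": "xanthine",
}

NEGATIVE_SELECTION = {
    "HSV-TK": "acyclovir/gancyclovir",
    "HPRT": "6-thioguanine",
    "Gpt": "6-thioguanine",
    "cytosine deaminase": "5-fluoro-cytosine",
}


def selective_agent(marker, mode="positive"):
    """Return the selective agent for a marker used in positive or negative selection."""
    if mode == "positive":
        table = POSITIVE_SELECTION
    elif mode == "negative":
        table = NEGATIVE_SELECTION
    else:
        raise ValueError("mode must be 'positive' or 'negative'")
    return table[marker]
```

Note that HPRT and Gpt appear in both tables: the same gene can serve for positive or for negative selection depending on the agent applied.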

[0050] The transposable DNA sequence of interest is any DNA molecule. It can either be a genomic DNA fragment
such as gene(s), intron(s), exon(s), regulatory sequence(s) or combinations or fragments thereof, or a recombinant
DNA molecule such as a cDNA gene for instance. According to a preferred embodiment, the transposable DNA
sequence of interest is a therapeutic gene. A "therapeutic gene" is a gene that corrects or compensates for an underlying
protein deficit or, alternately, that is capable of down-regulating a particular gene, or counteracting the negative effects
of its encoded product, in a given disease state or syndrome. Moreover, a therapeutic gene can be a gene that mediates
cell killing, for instance, in the gene therapy of cancer. According to another embodiment, the transposable DNA
sequence of interest is a reporter gene as previously defined.
[0051] The present invention also provides a kit to perform site-directed recombination; such kit comprises at least:

(i) a gene encoding a fusion protein, a polynucleotide, optionally a promoter that is operatively linked to said fusion
protein gene or to said polynucleotide, a DNA cassette or a vector of the invention; and

(ii) a donor DNA molecule that comprises a transposable DNA sequence of interest, said DNA sequence of interest
being flanked at its 5' and/or 3' ends by said SDRTS, said SDRTS being specifically recognized by the recombinase
domain of the fusion protein encoded by a gene, a polynucleotide, a DNA cassette, a vector of (i);

(iii) optionally, a recipient DNA molecule that comprises at least one DNA responsive element that binds the DNA-
binding domain of the fusion protein encoded by a gene, a polynucleotide, a DNA cassette or a vector of (i).

[0052] The invention also provides a kit to perform site-directed recombination wherein said kit comprises at least : 

(i) a fusion protein according to the invention ; and 

(ii) a donor DNA molecule that comprises a transposable DNA sequence of interest, said DNA sequence of interest
being flanked at its 5' and/or 3' ends by said SDRTS, said SDRTS being specifically recognized by the recombinase
domain of the fusion protein of (i);

(iii) optionally, a recipient DNA molecule that comprises at least one DNA responsive element that binds the DNA-
binding domain of the fusion protein of (i).

[0053] In another embodiment, the invention provides a kit to perform site-directed recombination wherein said kit 
comprises at least : 

(i) a gene targeting vector of the invention ; and 

(ii) optionally, a recipient DNA molecule that comprises at least one DNA responsive element that binds the DNA-
binding domain of the fusion protein of (i).

[0054] It is also a goal of the invention to provide a method for in vitro site-directed DNA recombination, said method
comprising the steps of combining:

a) a donor DNA molecule that comprises a transposable DNA sequence of interest, the DNA sequence of interest
being flanked at its 5' and/or 3' ends by said SDRTS being specifically recognized by the recombinase domain of
the fusion protein of the invention; with

b) a recipient DNA molecule that comprises at least one DNA responsive element that binds the DNA-binding
domain of the fusion protein of the invention; with

c) a fusion protein of the invention.

[0055] In another embodiment, the invention provides a method for in vivo site-directed DNA recombination, said
method comprising the steps of combining into a cell:

a) a donor DNA molecule that comprises a transposable DNA sequence of interest, the DNA sequence of interest
being flanked at its 5' and/or 3' ends by SDRTS being specifically recognized by the recombinase domain of the
fusion protein of the invention; with

b) a recipient DNA molecule that comprises at least one DNA responsive element that binds the DNA-binding
domain of the fusion protein of the invention; with

c) a fusion protein, or a gene encoding a fusion protein or a polynucleotide, optionally a promoter that is operatively
linked to said fusion protein gene or to said polynucleotide, or a DNA cassette, or a vector of the invention, said
gene or polynucleotide being efficiently expressed in said cell.

[0056] More preferably, the DNA recombination is selected among DNA insertion, deletion, inversion, translocation [...]

[0057] By "transposon" [...]
strand of DNA which may be linear or may be a circularized plasmid. The transposable elements of the invention have
SDRTS sequences at their extremities, and are able to site-specifically integrate into sites within a second strand.
SDRTS sequences of the present invention have preferably less than 100 bp, 80 bp, 75 bp, 50 bp, 45 bp, 40 bp, 35
bp, 30 bp, 25 bp, 20 bp, 15 bp, 12 bp, 10 bp, 8 bp, less than 5 bp. Preferred transposable elements of the invention
have a short (i.e., less than twenty) base pair repeat at either end of the linear DNA. However, SDRTS sequences
longer than 50 bp are also in the scope of the invention. The SDRTS sequences of the invention are generally inverted
repeats of one another (exact or closely related). According to the present invention, preferred SDRTS include the 26
bp inverted repeat ends of IS30 or the appropriate SDRTS sequences of transposases and recombinases as [...]
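
The statement that SDRTS sequences are generally inverted repeats of one another, exact or closely related, can be checked computationally. The sketch below is illustrative only: the function names and the max_mismatches parameter are assumptions made for the example, and the sequences in the usage lines are arbitrary placeholders, not the IS30 ends.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(left, right, max_mismatches=0):
    """True if `right` equals the reverse complement of `left`,
    tolerating up to `max_mismatches` differing positions
    ("exact or closely related" repeats)."""
    left, right = left.upper(), right.upper()
    if len(left) != len(right):
        return False
    expected = reverse_complement(left)
    mismatches = sum(a != b for a, b in zip(expected, right))
    return mismatches <= max_mismatches


# Placeholder sequences, chosen only to exercise the function.
exact = is_inverted_repeat("GATTACAGGC", "GCCTGTAATC")
related = is_inverted_repeat("GATTACAGGC", "GCCTGTAATT", max_mismatches=1)
```

Setting max_mismatches above zero models the "closely related" case; the tolerance that is biologically acceptable is not stated in the text and would have to be determined experimentally.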

[0058] By "recipient DNA molecule" is meant any DNA molecule. Preferably, this recipient DNA molecule is a cellular
genome, such as bacterial chromosome(s), or eukaryotic chromosome(s). Alternatively, this recipient DNA molecule
is a viral genome. The recipient DNA molecule can be a naked DNA molecule or have a chromatin structure. The
recipient DNA molecule is either present in a living cell or in a cell free extract.

[0059] The invention also provides a method for the stable introduction of a DNA sequence of interest into at least
one recipient DNA molecule of a cell, said recipient DNA molecule comprising at least one DNA responsive element
that binds the DNA-binding domain of the fusion protein of the invention, and said method comprising the steps of:

a) providing a donor DNA molecule that comprises a transposable DNA sequence of interest, said DNA sequence
of interest being flanked at its 5' and/or 3' ends by said SDRTS being specifically recognized by the recombinase
domain of the fusion protein of the invention;

b) introducing into the cell said donor DNA molecule; and,

c) previously, simultaneously, or separately, introducing into said cell a fusion protein or a gene encoding a fusion
protein or a polynucleotide, optionally a promoter that is operatively linked to said fusion protein gene or to said
polynucleotide, or a DNA cassette, or a vector, said gene or polynucleotide being efficiently expressed in said cell.

[0060] The targeting vector of the invention functions as a functional transposon or a transposable element by
allowing the transformation and/or the integration of at least a DNA sequence of interest into a prokaryotic or eukaryotic
host cell genome. In a preferred embodiment, the vector or transposable element of the invention comprises a gene
encoding a fusion protein, two SDRTS sequences, a DNA sequence of interest, where the DNA sequence of interest
is located between the two SDRTS sequences, and a promoter that is adapted to cause the transcription of the fusion
protein in a temporal, spatio-controlled, inducible or constitutive manner; where the fusion protein excises from the
vector a fragment comprising the two SDRTS sequences and the portion of the vector between the two SDRTS
sequences, and inserts the excised fragment into a chromosome or a DNA sequence of the host cell. This
ensures that the transposase gene is not incorporated into the target chromosome or the DNA sequence of the host
cell, ensuring that the transformation will be stable. Descendants of a cell transformed with the vector will not have a
copy of the transposase gene. Without a transposase gene to encode a transposase, there will be nothing to promote
excision of the DNA sequence of interest from the genome.
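
The excision logic of this paragraph can be illustrated with a toy model in which the vector is an ordered list of named elements. This is a sketch under stated assumptions: the element names, their order on the vector and the function excise_between_sdrts are hypothetical, and the model ignores sequence, orientation and the integration step itself.

```python
def excise_between_sdrts(vector):
    """Split a vector (ordered list of element names) into the fragment
    spanning the first to the last 'SDRTS' element, inclusive, and the
    remaining backbone that is left behind."""
    sites = [i for i, element in enumerate(vector) if element == "SDRTS"]
    if len(sites) < 2:
        raise ValueError("the vector needs at least two SDRTS elements")
    start, end = sites[0], sites[-1]
    fragment = vector[start:end + 1]
    backbone = vector[:start] + vector[end + 1:]
    return fragment, backbone


# Hypothetical layout: the fusion protein gene sits outside the SDRTS pair.
vector = [
    "promoter",
    "fusion_protein_gene",
    "SDRTS",
    "DNA_sequence_of_interest",
    "SDRTS",
]
fragment, backbone = excise_between_sdrts(vector)
```

In this layout the excised fragment carries the DNA sequence of interest but not the fusion protein gene, which is the property the paragraph relies on for stable transformation.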

[0061] It is also a goal of the invention to provide a host cell transformed by at least one polynucleotide, one
cassette, one vector of the invention.
[0062] A "host", as the term is used herein, includes prokaryotic or eukaryotic organisms that can be genetically
engineered. For examples of such host cells, see Sambrook et al., 1989. A host cell of the invention is either a prokaryotic
cell or a eukaryotic cell derived from said organisms. The cell of the invention is characterized in that
the protein fusion encoded by said polynucleotide, said DNA cassette, or said vector, is expressed and biologically
active in said cell. By "biologically active", it is meant that the fusion protein of the invention, or a fragment or a variant
thereof, is able to catalyse site-directed DNA recombination.

[0063] In a preferred embodiment, the cell of the invention is selected among eukaryotic and prokaryotic cells.
Preferably, the cell of the invention is a eukaryotic cell selected among mammalian cells (e.g. human
cells, murine cells), yeast cells, amphibian cells, fish cells, Drosophila cells, Caenorhabditis cells, plant cells. More
preferably, it is a mammalian cell, more specifically selected among totipotent stem cells (e.g. embryonic stem cells),
multipotent stem cells, fibroblasts, cardiac muscle cells, skeletal muscle cells, glial cells, neurons. In a preferred
embodiment, the cell of the invention is a prokaryotic
cell selected among Escherichia sp., Bacillus sp., Campylobacter sp., [...]
coccus sp., Thermophilus sp., Morbizobium sp., Rhizobium sp., Neisseria sp., Pseudomonas sp.,
Mycobacterium sp., Streptomyces sp., Corynebacterium sp., Lactobacillus sp., Micrococcus sp., [...] sp., Brucella
sp., Bordetella sp., Proteus sp., Klebsiella sp., Erwinia sp., Vibrio sp., Photorhabdus sp., Desulfovibrio sp., Listeria sp.,
Clostridium sp., Actinomyces sp., Haemophilus sp.

[0064] Host cells can be transformed with the disclosed polynucleotides, DNA cassette, or vectors of the invention
using any suitable means and cultured in conventional nutrient media modified as is appropriate for inducing promoters,
selecting transformants or detecting expression. Suitable culture conditions for host cells, such as temperature and
pH, are well known. In a preferred embodiment, transformed cells of the invention are cultured in presence of a selective
agent, as previously described, in the cell culture media.

[0065] Such polynucleotides, DNA cassette, or vectors of the invention can be introduced into a host cell either in
vivo or in vitro using known techniques depending on the nature of the host cells. DNA can be introduced into
bacteria and yeast in a number of ways. These include conjugation, transformation, transduction, or the most recent
development, electroporation. Transformation is the introduction of naked DNA into competent recipient cells.
Competence may be either natural or artificially induced using calcium or rubidium chloride. Electroporation uses pulses
of high voltage electric current to enable uptake of DNA into cell protoplasts. In transduction, bacteriophages
encapsidate foreign DNA in place of their genome, which is introduced into the recipient in a subsequent round of phage
infection. DNA can be introduced into eukaryotic cells, preferably into mammalian cells, by using known techniques
such as CaPO4 precipitation, electroporation, cationic lipofection, use of artificial viral envelopes, direct injection (e.g.,
intravenous, intraperitoneal or intramuscular), and micro-injection into a zygote or a pronucleus of a zygote. The
invention relates to an isolated transgenic host cell transformed by a polynucleotide, DNA cassette or vector of the
invention; the protein fusion encoded by said polynucleotide, DNA cassette, or vector of the invention is expressed
and biologically active in said cell of the invention.

[0066] The present invention also relates to the transgenic organism comprising at least one cell according to the
invention. An "organism", as the term is used herein, includes but is not limited to bacteria, yeast, animals, plants.
Among the animals, one can designate mammals, such as rodents, primates (except humans), farm animals. In a
preferred embodiment, the animal is a mouse, a rat, a Chinese pig, a hamster, a rabbit, a pig, a cow, a horse, a goat,
a sheep. In a preferred embodiment, the animal is a mouse. In another preferred embodiment, the organism is a yeast.
[0067] Therefore the invention relates to the use of a fusion protein, a polynucleotide, a vector, a DNA cassette, or
a kit of the invention, for producing transgenic cells and/or transgenic animals. Transgenic cells and animals are among
the most useful research tools in the biological sciences. These cells and animals have a heterologous (i.e., foreign)
gene or gene fragment incorporated into their genome that is passed on to their offspring. Although there are several
methods of producing transgenic animals, the transgenic cells and organisms of the invention are generally obtained
by using microinjection of a polynucleotide, a vector, a DNA cassette of the invention into single-cell embryos. These
embryos are then transferred into pseudopregnant recipient foster mothers. The offspring are then screened for the
presence of the new gene, or gene fragment. Potential applications for transgenic animals include discovering the
genetic basis of human and animal diseases, generating disease resistance in humans and animals, gene therapy,
drug testing, and production of improved agriculture, livestock. More specifically, the use of a fusion protein, a
polynucleotide, a vector, a DNA cassette, a kit of the invention is useful to perform gene targeting, gene knock-out (KO), gene
knock-in (KI), gene trapping (e.g. PolyA-trapping, exon trapping, promoter trapping, enhancer trapping), or transposon [...]

[0068] By "gene targeting" is meant a process whereby a specific gene, or a fragment of that gene, is altered. This
alteration of the targeted gene may result in a change in the level of RNA or protein that is encoded by that gene, or
the alteration may result in the targeted gene encoding a different RNA or protein than the untargeted gene. The
targeted gene may be studied in the context of a cell, or, more preferably, in the context of a transgenic animal.
[0069] By "Knock-out" is meant an alteration in the nucleic acid sequence that reduces the biological activity of the
polypeptide normally encoded therefrom by at least 80% compared to the unaltered gene. The alteration may be an
insertion, deletion, frameshift mutation, or missense mutation. Preferably, the alteration is an insertion or deletion, or
is a frameshift mutation that creates a stop codon.

[0070] By "Knock-in" is meant an alteration in the nucleic acid sequence by insertion of an additional exogenous
DNA molecule that allows either an increased biological activity of the polypeptide normally encoded therefrom
compared to the unaltered gene or the expression of a heterologous protein encoded by said exogenous DNA molecule,
usually under the control of the regulatory elements of the endogenous gene encoding the polypeptide. The "Knock-
in" strategy may be used to insert a reporter gene (e.g., green fluorescent protein), the expression of which is then
subject to identical regulatory controls that were placed on the gene that was replaced. This is useful for tagging
particular cell types or to facilitate studies of gene regulation. Alternatively, gene replacement may be used to assess
the degree of functional redundancy among two related proteins or to examine the phenotype produced by replacing
a normal protein with a mutated form.

[0071] "Gene-trapping" is an approach based on a class of reporter vectors that have been described containing a
splice-acceptor site upstream of a reporter gene (lacZ, gfp, neoR). Integration of these vectors into a genomic locus
downstream of a functional promoter results in the generation of a fusion transcript between the endogenous gene
and the reporter gene. Fusion transcripts from insertion of these vectors mimic endogenous gene expression at the
insertion locus. This expression can be monitored by visualizing reporter gene activity (Gossler et al., 1989). Usually
the gene trap vectors will also act as insertional mutagens, disrupting the endogenous gene function.
[0072] According to the integration locus, one can distinguish different gene-trap approaches named "exon trapping",
"promoter trapping", "enhancer trapping".

[0073] For instance, enhancer trapping involves the random insertion into a eukaryotic genome of a promoter-less
foreign gene (the reporter) whose expression can be detected at the cellular level. Expression of the reporter gene
indicates that it has been fused to an active transcription unit or that it has inserted into the genome in proximity to
cis-acting elements that promote transcription. This approach is important in identifying genes that are expressed in a cell
type-specific or development stage-specific manner.

[0074] The above system can be a tool for the genetic modification of both prokaryotic and eukaryotic cells and
organisms. The system allows site-specific integration of genetic information in whole genomes resulting in altered
genotypes (gain of function mutation or loss of function mutation). A gain of function mutation (e.g. gene therapy,
knock-in or transgenesis) is achieved if the targeting vector contains a coding sequence for a given protein, or sequences
that induce the transcriptional activity of a gene adjacent to the integration site in the recipient genome. A loss of
function genotype (e.g. insertional mutagenesis) is created when the targeted insertion results in the transcriptional or
translational inactivation of the gene adjacent to the integration site. The technology will allow efficient targeting of
DNA sequences in genomes where other site-specific recombination or gene targeting technologies are not feasible.
[0075] It is proposed that eukaryotic systems (including commonly used models such as yeast, plants, animal models
such as C. elegans, Drosophila, teleost fish, Xenopus, as well as chick or mammalian systems) or any other cell systems
in which the technology for the introduction of foreign DNA is available could utilize a site-targeted recombination
system as described in this invention. Utilization of such a system is not restricted to in vivo applications, but is
extendable to in vitro experimentation utilizing cell culture (e.g. embryonic stem cells).

[0076] Accordingly, the methods, DNA molecules and vectors of the invention can be used for a variety of therapeutic
and diagnostic applications which require stable and efficient integration of transgene sequences into genomic DNA
of cells. The methods, polynucleotides and vectors can be used to transform a wide variety of eukaryotic (e.g.,
mammalian) cells and provide the advantage of high efficiency DNA transfer.

[0077] Therefore, the present invention provides a vector of the invention as a medicament. Such a vector as a
medicament is useful to perform gene therapy on a patient in need of such treatment (e.g. cancer, metabolic diseases,
infectious diseases, genetic diseases...).

[0078] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is
commonly understood by one of skill in the art to which this invention belongs.

[0079] The figures and examples presented below are provided as a further guide to the practitioner of ordinary skill
in the art and are not to be construed as limiting the invention in any way.

EXAMPLES 

[0080] The inventors used the teleost fish zebrafish (Danio rerio) in their experiments as a vertebrate animal model
to address the suitability of the targeted transposition system to vertebrate genetics. While transposase-mediated
recombination systems have previously been shown to be active in fish (Raz et al., 1996; Lam et al., 1996; Kawakami
et al., 2000; Izsvak et al., 1995; Ivics et al., 1997), these systems either proved to be inefficient or were burdened by
the restricted size of donor DNA that can be inserted by these 'cut and paste' transposases. In the present invention,
the inventors propose to use a transposition based recombination system which does not restrict the size of donor
DNA to be inserted into the target genome. The IS30-mediated transpositional integration will be more efficient than
the existing transposase based recombination systems.

EXAMPLE 1 : CONSTRUCTION AND TEST OF A SIMPLE TRANSPOSITION REACTION IN ZEBRAFISH 
EMBRYOS 

[0081] In this part of the project, evidence is provided that the prokaryotic IS30 element can induce transposition in
a eukaryotic system. A test system based on an IS30-specific transpositional reaction has been established. In the
system, the transposase gene is separated from its active site (the (IS30)2 structure) and cloned in different plasmids
in order to test their activity. The system contained the following components:

- Tester plasmid: the plasmid contains two intermediate dimer inverted repeat elements (IS30)2 which flank the
chloramphenicol resistance gene (CmR) but without an active transposase gene (Farkas et al., 1996). The
transposition reaction results in the specific loss of the CmR gene, which can be detected by replica plating (Fig. 1).

- Transposase producer construct: the open reading frame of the IS30 transposase was PCR amplified, cloned
into the vector pBluescriptSK (Stratagene) and sequenced in order to verify the integrity of [...]. In
order to place the IS30 transposase ORF under the control of a well-characterised eukaryotic promoter (CMV), the
transposase ORF was cloned as a XhoI cassette into various pCS2 derivative plasmid vectors (Turner et al., 1994).
Several IS30 transposase producer constructs have been developed where the IS30 transposase is fused to signal
peptides to allow the detection of the transposase protein produced in fish embryos. Furthermore, some of these
constructs carry a sequence encoding a eukaryotic nuclear localization signal peptide (NLS) in order to introduce
the synthesized protein into the cell nucleus:



Transposase producer constructs

Name          Vector         Description
IS30          CS2            IS30 ORF alone
IS30+MT       CS2 + MT       IS30 ORF N-terminal part was fused to the myc epitope tag (MT) in order to allow
                             the immuno-histochemical detection of the fusion protein in fish cells
IS30+NLS      CS2 + NLS      IS30 ORF N-terminal part was fused to the nuclear localization signal (NLS) in
                             order to transfer the synthesized protein to the nucleus
IS30+NLSMT    CS2 + NLSMT    IS30 ORF N-terminal part was fused both to the myc tag MT and the NLS

[0082] The inventors have verified previously that the eukaryotic promoter CMV retained its activity in Escherichia
coli (unpublished results), hence providing a possibility to test the transposition activity of the constructs described
above in bacteria. E. coli cells carrying the tester plasmid together with one of the transposase producer plasmids were
selected for the kanamycin resistance marker (KmR) of the vector. The detection of transposition was based on the
loss of the CmR marker gene (CmS) as described previously (Farkas et al., 1996). The results indicated that the fusion
proteins were active in transposition (Table II).

Table II

Transposition of the transposase producer constructs in E. coli

                        Total     Rearranged products    Rearranged products
Constructions           tested    CmS      %             Transposition   Deletion   Other
tester                  1986      0        <0.05
tester + IS30           1324      23       1.74          20              2          1
tester + NLS IS30       1546      18       1.16          16              1          1
tester + MT IS30        1765      14       0.79          14
tester + NLS MT IS30    1878      21       1.12          19              1


[0083] The frequency of transposition was relatively low compared to the values obtained by an efficient prokaryotic
promoter, but the majority of the recovered recombined plasmids were products of transposition (specific loss of the
CmR gene). In the control experiment where only the tester was used, no transposition was detected. There
was no significant difference between the activities of the transposase producer constructs (IS30, NLSIS30, MTIS30,
NLSMTIS30), suggesting that the N-terminal fusion of peptides does not affect the activity of the transposase protein.
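
The percentages reported for the E. coli assay follow directly from the colony counts: the number of CmS (rearranged) colonies divided by the total tested. The short sketch below recomputes them; the counts are taken from the table, while the variable and function names are chosen only for this illustration.

```python
# (total colonies tested, CmS rearranged colonies) per construct, from the E. coli table.
ECOLI_COUNTS = {
    "tester + IS30": (1324, 23),
    "tester + NLS IS30": (1546, 18),
    "tester + MT IS30": (1765, 14),
    "tester + NLS MT IS30": (1878, 21),
}


def frequency_percent(total, rearranged):
    """Rearrangement frequency as a percentage of all colonies tested."""
    return 100.0 * rearranged / total


frequencies = {
    name: round(frequency_percent(total, rearranged), 2)
    for name, (total, rearranged) in ECOLI_COUNTS.items()
}
```

The recomputed values agree with the reported ones to two decimal places, which also supports reading the flattened columns in the order total tested, CmS, percentage.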

EXAMPLE 2 : DETECTION OF TRANSPOSITIONAL DELETION GENERATED BY IS30 TRANSPOSASE IN 
ZEBRAFISH 

[0084] Zebrafish embryos were co-injected with the tester plasmids together with different transposase producers.
Immunohistochemical staining was applied to detect the MYC epitope tagged transposase protein [...].
Synthesis and localisation of the protein was demonstrated in nuclei as well as in the cytoplasm of cells in mRNA-
injected zebrafish embryos (Fig. 2). It was concluded that the transposase protein was synthesized in the cells in vivo.
After 10 hours of development (neurula stage), total DNA was purified from the embryos and this DNA was used to
transform E. coli cells in order to detect transposition of plasmid sequences in fish embryos. The same method was
used as described above and in Farkas et al. (1996). The results indicated that an IS30 type transposition reaction occurred
in fish embryos (Table III).

Table III :

Transpositional deletion by IS30 transposase in zebrafish

                        total     rearranged products    rearranged products
Constructions           tested    CmS      %             transposition   deletion   other
control non injected    0         0
tester                  1949      2        0.1                           1          1
tester + IS30           1680      15       0.89          12              2          1
tester + NLS IS30       1718      14       0.81          10              1          3
tester + NLSMTIS30      516       4        0.78          4


[0085] The reaction catalyzed by the transposase in embryos resulted in the specific loss of the marker gene;
therefore the transformed bacteria exhibited the kanamycin resistance (KmR), but not the chloramphenicol resistance (CmR)
marker. When the tester alone was injected into the embryos (negative control), the frequency of the observed
rearrangements was low (0.1 %). In contrast, the frequency of recombinations was significantly higher (approximately 8 times
difference) when one of the transposase producer constructs was co-injected. Moreover, the majority of the recovered
plasmids contained the expected transpositional deletion product (Fig. 1), which was confirmed by sequencing. The
remaining recovered isolates were products of deletions or multiple rearrangements. All of these results confirm that
the IS30 element was active in zebrafish. There was no significant difference between the activities of the different
transposase producer constructs used.
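
The text does not state which statistical test, if any, underlies the phrase "significantly higher". As one possible check, a one-sided Fisher's exact test can be applied to the counts of Table III for the tester alone (2 rearranged out of 1949) and the tester co-injected with the IS30 producer (15 out of 1680). The choice of test and the function name are assumptions made for this sketch, not the inventors' method.

```python
from math import comb


def fisher_one_sided(control_events, control_total, treated_events, treated_total):
    """One-sided Fisher's exact test: probability, under the hypergeometric
    null of equal rates, of seeing at least `treated_events` events in the
    treated group given the group sizes and the pooled number of events."""
    n_events = control_events + treated_events
    n_total = control_total + treated_total
    denominator = comb(n_total, n_events)
    numerator = 0
    upper = min(n_events, treated_total)
    for k in range(treated_events, upper + 1):
        numerator += comb(treated_total, k) * comb(control_total, n_events - k)
    return numerator / denominator


# Counts from Table III: tester alone versus tester + IS30.
p_value = fisher_one_sided(2, 1949, 15, 1680)
```

With these counts the p-value falls well below the conventional 0.05 threshold, which is consistent with the qualitative statement in the text.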

[0086] The transposase producer constructs contain a promoter to allow the in vitro synthesis of IS30 mRNA. In vitro
transcribed IS30 mRNAs were injected into the fish embryos. To test the activity of the transposase in zebrafish embryos,
total DNA was isolated from injected embryos and loss of the CmR marker gene from the tester plasmid was assayed
after transformation of E. coli as described previously. The results indicated that the injected IS30 transposase mRNA
was translated and correctly excised the CmR marker gene in fish embryos at a relatively high frequency (Table IV).

Table IV :

Transpositional deletion by IS30 transposase injected as mRNA in zebrafish

                            total     rearranged products    rearranged products
Constructions               tested    CmS      %             transposition   deletion   other
control non injected        0         0
tester                      1848      1        0.05
tester + IS30 mRNA          907       3        0.33          2
tester + NLS IS30 mRNA      787       13       1.65          11              1          1
tester + MT IS30 mRNA       38        2        5.2           2
tester + NLSMTIS30 mRNA     282       3        1.06          3


[0087] The embryos injected with the tester alone yielded very low frequencies of recombined products (0.05 %);
however, when the transposase was also co-injected as mRNA the frequency of recombination was markedly
increased by up to 100-fold (0.33-5.2 %). Altogether the results indicated that in vitro synthesised mRNA can be used
to produce active transposase in the fish embryo.
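
The percentages reported in Table IV follow from the counts: each frequency is the number of CmS (rearranged) plasmids divided by the total number tested. The short Python sketch below reproduces that arithmetic from the table counts. The function and variable names are illustrative only and are not part of the original disclosure.

```python
# Counts transcribed from Table IV: (total plasmids tested, CmS rearranged products).
TABLE_IV = {
    "tester": (1848, 1),
    "tester + IS30 mRNA": (907, 3),
    "tester + NLS IS30 mRNA": (787, 13),
    "tester + MT IS30 mRNA": (38, 2),
    "tester + NLS MT IS30 mRNA": (282, 3),
}


def rearrangement_percent(total, cms):
    """Percentage of tested plasmids that lost the Cm marker."""
    return 100.0 * cms / total


def fold_over_baseline(table, baseline="tester"):
    """Fold increase of each construct's frequency over the tester-alone baseline."""
    base = rearrangement_percent(*table[baseline])
    return {
        name: rearrangement_percent(*counts) / base
        for name, counts in table.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, counts in TABLE_IV.items():
        print(f"{name}: {rearrangement_percent(*counts):.2f} %")
    for name, fold in fold_over_baseline(TABLE_IV).items():
        print(f"{name}: {fold:.0f}-fold over tester alone")
```

With these counts the largest increase is about 97-fold (tester + MT IS30 mRNA, based on only 38 tested plasmids), which is consistent with the "up to 100-fold" statement above.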

EXAMPLE 3 : TRANSPOSITIONAL INSERTIONS BY IS30 TRANSPOSASE IN ZEBRAFISH 

[0088] The above described assay is based on a specific reaction (characteristic for IS30), i.e. on the site-specific
deletion of DNA fragments flanked by dimerized inverted repeats (IS30)2 by the IS30 transposase. In order to detect
an integration reaction resulting in insertion of donor sequences into a recipient DNA, the prerequisite for transgene
integration induced by IS30 in the zebrafish genome, a second set of experiments was performed. To detect trans-
positional integration reactions induced by IS30, the following experimental setup was developed.

3.1. Constructs of a gene-trap insertion system

[0089] A gene trap construct has been generated which contains a gfp reporter gene attached to a splice-acceptor
sequence to allow the expression of the marker gene only in the case of a gene-trap event, i.e. when the gfp gene is
inserted into an intronic sequence of a gene (Skarnes, 1990). The integration of the gene trap construct into a tran-
scribed region of the fish genomic DNA is expected to be facilitated by IS30 transposase, which recognises the
dimer intermediate structure of the donor DNA. The recombination of the gene trap donor and an intronic target results
in the expression of the gfp reporter gene, allowing the rapid detection of insertion into a genomic gene. In order to
detect the integration of a marker gene into a genomic gene in the zebrafish embryo, a test system was developed
where the intronic integration in a given gene is modeled by using a target plasmid which contains an intact genomic
gene (see schematic in Fig. 3).

[0090] The test system contains the following elements: 

- mRNA encoding the IS30 transposase gene is in vitro transcribed from a plasmid containing the SP6 promoter
and serves as source of the transposase when translated in the cell.

- gfp gene-trap donor harbors the joined IS30 ends ((IS30)2) and a marker gene encoding green fluorescent protein
(gfp). The marker gene is not expected to be expressed [...].
The reporter gene was placed downstream of sequences containing a splice-acceptor site of the carp β-actin first
intron to create a gene trap construct. Thus, the gfp reporter gene can be activated by genomic promoters/en-
hancers, which consequently provides an in vivo assay for monitoring the integration. IS30 transposase requires
a circular donor template for the integration reaction. Therefore the donor DNA can be an intact circular plasmid
(GFP plasmid donor); alternatively, the donor fragment containing the gene trap construct can be excised from the
plasmid and religated to generate a circular template (circular GFP-fg donor).

3.1.1. Transpositional target :

[0091] Plasmids containing the frequently used integration sites of IS30 (hot spots; Olasz et al., 1997a; Olasz et al.,
1998) were used. Two types of hot spot sequences were chosen: (i) the consensus sequence of the frequently used
integration sites (called GOHS) and (ii) the (IS30)2 structure itself. These hot spot sequences were cloned in the first
intron ([...] site, nucleotide position [...]) [...]
in the plasmids GOHS-shh target and (IS30)2-shh target. This target plasmid contains the transcriptional regulatory
elements of the shh gene and therefore is expected to activate the gfp gene of the gene-trap donor when the donor
sequences are integrated into the first intron of the shh gene.

3.1.2. Testing of the gene trap insertion system :

[0092] The transpositional target construct was successfully tested in E. coli, where insertion of the gfp donor was
detected in the GOHS hot spot of the GOHS-shh target. The resulting hybrid plasmid containing the gene-trap cassette
with the splice-acceptor site and gfp marker [...] zebrafish (shh-gfp hybrid). The shh-gfp hybrid plasmid was injected
into zebrafish embryos and expression of gfp was monitored at one day of development (Prim 6 stage). gfp activity
was found in notochord cells (the cell type which normally expresses the shh gene) in 8 out of 30 injected embryos,
indicating that the gfp marker has come under the control of the shh regulatory elements (Fig. 4). Therefore, the gfp
gene inserted into the first intron of the shh locus functions as a marker of a bona fide gene-trap event. Non-specific
activity of gfp was also seen due to a background activity of the gfp marker, probably under the control of cryptic
intronic promoter-enhancer elements provided by the carp β-actin intronic sequences adjacent to the splice-acceptor
site (Fig. 4C, D).

3.2. Insertion of gene trap donor into the shh target plasmid induced by the IS30 transposase in fish embryos

[0093] The circular GFP fragment donor and the GOHS-shh target were injected into fish embryos in
combination with synthetic mRNA encoding the IS30 transposase (see schematic in Figure 3). When the IS30 trans-
posase mediates integration of the gfp donor into the shh target in fish embryos, an increase in neural tube
and notochord cells expressing gfp is expected as a result of a gene trap event. Notochord cells expressing the gfp
gene were detected in the embryos (4/130 embryos), demonstrating that a gene-trap event in the shh locus has been
indeed induced by the IS30 transposase (Fig. 5C, D). No expression in the notochord was detected when the target
and donor DNA were co-injected without IS30 transposase (Fig. 5B).

[0094] Integration can be detected by PCR amplification of characteristic junction fragments between donor and
target sequences. Embryos injected at the one cell stage with combinations of gfp fragment donor or the plasmid GFP
donor and the GOHS-shh target plasmid, with or without IS30 transposase mRNA, were harvested after 10 hours of
development and total DNA was prepared. PCR reactions were carried out using the prepared DNA as template and
using oligonucleotide primers as shown in Fig. 6A to detect junction fragments between donor and target plasmids
following transposon mediated recombination in the fish embryos. The expected junction fragment indicative of insertion
of the circular GFP-fg donor or the plasmid GFP donor into the GOHS hot spot site of the shh locus has only been
detected in experiments where IS30 mRNA was co-injected, indicating that the transposition was carried out correctly
by the transposase (Table V).



Table V : Transposition of gfp donor DNAs on target plasmids in zebrafish embryos

                                                          Sense orientation junction    Inverse orientation junction
Construct injected                                        5' end      3' end            5' end      3' end
1. circular GFP-fg donor
2. plasmid GFP donor
3. circular GFP-fg donor + GOHS-shh target + IS30 mRNA    +           +                 +           +
4. circular GFP-fg donor + GOHS-shh target
5. plasmid GFP donor + GOHS-shh target + IS30 mRNA        +           +                 +           +
6. plasmid GFP donor + GOHS-shh target

[0095] The resulting junction fragments were tested further by restriction digestion confirming the correct identification
(Fig. 6B and D). Insertions in both orientations were detected by PCR (Fig. 6D). Expression of gfp was monitored in
the injected embryos at one day of development. No specific activity was seen in the notochord or floor plate cells,
suggesting that the transposition has taken place in a low number of plasmids, below the threshold level required for
detection of tissue-specific gfp expression. The detection of junction fragments between the target and donor sequences
together with the in vivo detection of tissue-specific gfp expression demonstrate that transpositional integration gener-
ated by the IS30 transposase has taken place in the fish embryos.

EXAMPLE 4 : SITE-SPECIFIC IS30 INDUCED TRANSPOSITION BY AN IS30 TRANSPOSASE-CI LAMBDA
REPRESSOR FUSION PROTEIN IN E. COLI

[0096] The efficiency and accuracy of the transposition could be enhanced by modification of transposases. The
inventors tested whether the integration frequency could be increased and/or the site of integration could be altered by
linking the transposase to a well-defined DNA binding protein. The well-characterised CI repressor of lambda phage
was chosen as a targeting protein. A fusion protein (SEQ ID N° 2) was generated containing the CI repressor attached
to the IS30 transposase in experiments to define the frequency and target specificity of transposition.
[0097] DNAs containing the ORFs of the IS30 transposase and of the lambda CI repressor were PCR amplified,
sequenced and fused with a PstI linker. The C-terminal part of the IS30 transposase was joined to the
corresponding N-terminal part of the full length lambda repressor protein. The two parts of the fusion protein were
separated by two unrelated amino acids (Leu, Gln). The ORF was cloned in a pACYC vector (Chang
and Cohen, 1978), which contained the (IS30)2 intermediate structure. The inventors have shown that the fusion protein
retained the activities of both the transposase and the CI repressor. Bacteria containing the fusion product were
immune against lambda phage infection at 30 °C, while those containing the IS30 transposase only (without the ther-
mosensitive CI repressor) could be infected by lambda phage. On the other hand, the transposition activity of
the fusion protein was not affected by the CI fusion as compared to the wild type transposase with the GOHS hot spot
target (Fig. 7B).

[0098] In order to verify that the lambda repressor "domain" in the fusion protein can direct the insertion of the trans-
posable element, the lambda operator region (containing operators OR1, OR2 and OR3) was PCR amplified as an
approximately 200 bp fragment, cloned and used as target site in a set of experiments (for a schematic representation
see Fig. 7A).

[0099] The analysis of the target specificity of insertions mediated by the fusion protein was also performed. The
results confirmed that the fusion protein has a directing activity, as the majority of insertions were located in the 200-400
bp surroundings of the operator region (Fig. 7C). Moreover, no insertions were found in a control experiment where
the same plasmid (pEMBL19, Dente et al., 1983) but lacking the lambda operator sites was used as target.
[0100] The analysis of the sequences of the insertion sites revealed that only a limited homology was present com-
pared to the consensus sequence for IS30 hot spots (Fig. 8, bold letters in the consensus). There was no significant
homology between the insertion sites compared to the consensus sequence of the operator sites of the bacteriophage
lambda. These experiments clearly demonstrate that the IS30 transposase-CI lambda repressor fusion protein effi-
ciently directed transpositional insertions into the proximity of the target.

EXAMPLE 5 : IDENTIFICATION OF THE TRANSPOSASE DOMAIN RESPONSIBLE FOR TARGET RECOGNITION
IN IS30



[0101] The E. coli resident mobile element IS30 is characterised by a pronounced target specificity (Olasz et al.,
1998). Upon transposition, the element frequently inserts exactly into the same position of a preferred target sequence.
Insertion sites in phages, plasmids and in the genome of E. coli are characterised by an exceptionally long palindromic
consensus sequence that provides strong specificity for IS30 insertions despite a relatively high level of degeneracy.
This 24 bp long region alone determines the attractiveness of the target DNA and the exact position of IS30 mediated
insertion. This type of target specificity is called recognition of "natural" hot spots.

[0102] It was also verified that the inverted repeats (IR) at both ends of the IS30 element also serve as a target for
IS30 insertion (Olasz et al., 1997a). Not only the single IRs but also the (IS30)2 intermediate served as a preferred
target site. This kind of target specificity is different from that of recognition of "natural" hot spots and is hence called
recognition of "IR" hot spots.

[0103] On the basis of this dual target specificity ("natural" and "IR" hot spots) it becomes possible to determine the
protein domain of the IS30 transposase responsible for the recognition of the "natural" target by means of mutational
analysis. Namely, mutations in the domain responsible for the recognition of "natural" hot spots have no, or have limited,
effect on the recognition of "IR" hot spots.

[0104] Analysis of the domain structure of the IS30 transposase compared with that of IS30 related elements revealed
that there are 3 regions which could be responsible for the target specificity of the element. Therefore we focused on
the mutational analysis of these regions. One of these regions was located exactly at the beginning of the protein (1-39
aa). The coding region of these amino acids was deleted and the deletion variant transposase was used in a transpo-
sitional fusion assay (Fig. 9). The wild type transposase served as a control. Both natural (LHS and GOHS) and IR
(single IR and (IS30)2 intermediate) hot spots were offered as target sites (Table VI).

Table VI : The N-terminal part of the IS30 transposase is responsible for recognition of the "natural" target sites

                                  Frequency of transposition
Offered target                    Wild type IS30     N-terminal deletion of IS30
A. "natural" hot spots
   LHS                            8.0 x 10^-4        <1.3 x 10^-6
   GOHS                           1.6 x 10^-3        <1.2 x 10^-5
B. "IR" hot spots
   IRL - single IR                1.7 x 10^-3        1.3 x 10^-3
   (IS30)2 - intermediate         3.5 x 10^-3        1.5 x 10^-3


[0105] The deletion of the first 39 amino acids resulted in the loss of recognition of the "natural" hot spot sequences,
while other functions of the transposase remained intact (e.g. the recognition of IR hot spots); the mutant transposase
was still active in transposition (Table VI). Therefore it is concluded that the first 39 amino acids of the IS30 transposase
contain the DNA binding domain (DBD) of the transposase responsible for recognition of the target DNA. Based on
this result it is possible to exchange the target recognition DBD domain of IS30 with a heterologous DNA-binding
domain (e.g. histones, regulatory proteins, recombination enzymes, DNA modifying enzymes etc.). Thus, these ex-
periments clearly show that the development of transposases with altered target specificity is possible.
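
The conclusion drawn from Table VI rests on comparing, for each offered target, the wild type frequency with that of the N-terminal deletion variant. The Python sketch below makes that comparison explicit. It is illustrative only: the names are hypothetical, the deletion-variant values for the natural hot spots are upper bounds in the table (so the computed ratios are lower bounds), and the exponent of the LHS bound is read as -6, which is an assumption about a poorly legible entry.

```python
# Transposition frequencies transcribed from Table VI as (wild type, N-terminal deletion).
# For LHS and GOHS the deletion-variant value is an upper bound ("<" in the table).
# Assumption: the LHS bound is 1.3e-6 (the exponent is hard to read in the source).
TABLE_VI = {
    "LHS": (8.0e-4, 1.3e-6),
    "GOHS": (1.6e-3, 1.2e-5),
    "IRL": (1.7e-3, 1.3e-3),
    "(IS30)2": (3.5e-3, 1.5e-3),
}
UPPER_BOUND_ONLY = {"LHS", "GOHS"}


def fold_reduction(wild_type, deletion):
    """How many times lower the deletion-variant frequency is than wild type."""
    return wild_type / deletion


def summarize(table):
    """Map each target to (fold reduction, True if the ratio is only a lower bound)."""
    return {
        target: (fold_reduction(wt, dele), target in UPPER_BOUND_ONLY)
        for target, (wt, dele) in table.items()
    }


if __name__ == "__main__":
    for target, (fold, is_bound) in summarize(TABLE_VI).items():
        prefix = ">" if is_bound else ""
        print(f"{target}: {prefix}{fold:.1f}-fold lower than wild type")
```

Under these readings the natural hot spots show a reduction of more than two orders of magnitude, whereas the IR hot spots change by a factor of roughly 1.3 to 2.3.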

EXAMPLE 6 : TARGETED INSERTION BY IS30 IN VERTEBRATE CELLS (FISH EMBRYOS) 

[0106] Using a similar strategy to what has been described for the prokaryotic targeting system, a fusion protein was
generated and tested for targeting activity in a plasmid system in fish cells. The inventors asked whether an IS30
protein fused to a DNA binding domain would activate insertions in the proximity of a DNA binding recognition sequence
on a plasmid system. The DNA binding domain (aa 772 - 1106) of the Gli1 transcription factor of the zinc finger DNA
binding protein family (Kinzler et al., 1988) was fused to the IS30 coding sequences at the C terminal end of IS30 (SEQ
ID N° 4). The resulting fusion protein was expressed in the pCS2+ expression vector. As target DNA for the transpo-
sition, a plasmid containing the shh locus was used, with the exception that the GOHS target site was replaced
by the recognition sequence of Gli1 (GACCACCCA) (Kinzler et al., 1990; Sasaki et al., 1997). The
circular gfp fragment was used as donor, as described in Fig. 3. The scheme of the targeting experiment in fish is
described in Fig. 10. mRNA encoding the IS30-Gli1DBD fusion protein was co-injected into zebrafish 1 cell stage embryos
together with gfp circ fg donor and the shh(gli) target. As control, injections were carried out without the mRNA or with
shh target without the Gli1 binding site. DNA was prepared from gastrula stage embryos and PCR reactions were carried
out to detect junction fragments between shh target sequences and IS end of donor DNA using primers as shown in
Fig. 10 and Fig. 11. The primers S3 and IRL were used to detect insertions in the relative proximity of the Gli1 DNA
binding site. This PCR reaction is designed to detect only the insertion of the gfp donor fg in the sense orientation. The
size of the inserted fragment in case of an insertion close to the Gli1 binding site is expected to be approxi-
mately 780 base pairs. PCR reactions using the S3 and IRL oligonucleotides resulted in no amplification in the control
injected embryos (target and donor DNAs without IS30-Gli1DBD mRNA, or IS30-Gli1DBD mRNA and gfp donor fg with
shh target without Gli1 binding site). A smear of bands was detected in the PCR reaction where template DNA was used
from embryos injected with all three elements of the transposition system (data not shown). The resulting PCR products
were shotgun cloned into a PCR cloning vector (pTOPO, Promega) and a series of plasmids were grown containing
different insertion fragments ranging from 500-1500 bp. Clones were picked randomly and sequences
were obtained representing 16 different clones. In 7 cases a junction fragment between shh and circ gfp fg donor was
detected by sequence homology search (bestfit, GCG Wisconsin Package, Version 10.2). The junction structure
of the 7 clones is demonstrated in Fig. 11. In one case (clone #27) a characteristic junction fragment was detected,
indicative of a transposition event. The IS dimer structure was resolved at the junction between the (IS30)2. In this
clone, the junction of the IS end to the shh target sequences has occurred at the position 3995, only 36 bases away
from the 3' end of the Gli1 DNA binding site, supporting the notion of a targeting event. A further 4 clones contained
junction fragments very close to the Gli1 binding site (20-40 bases from the position of G from the GACCACCCA
sequence at 3976). In these cases the (IS30)2 dimer was not resolved, suggesting that the insertion took place by
illegitimate recombination, and that the fusion protein did not catalyse the resolution of the dimer structure. Since the
insertion events were clustered close to the Gli1 DNA binding site, it is argued that the recombination occurred by
involvement of binding of the IS30-Gli1DBD fusion protein to the Gli1 binding site. It is to be studied further whether
the lack of proper transposition is due to steric hindrance of the fusion protein, or other reasons.
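
The junction analysis above amounts to locating the Gli1 recognition sequence in the target and measuring how far each insertion point lies from it. The Python sketch below illustrates that bookkeeping on a made-up toy sequence. The sequence, the 0-based coordinates and the function names are assumptions for illustration; they are not the shh locus coordinates or the sequence search tool used in the original work.

```python
GLI1_SITE = "GACCACCCA"  # recognition sequence quoted in the text


def find_sites(seq, motif=GLI1_SITE):
    """Return the 0-based start positions of every occurrence of motif in seq."""
    seq = seq.upper()
    motif = motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def distance_to_site(insertion_pos, site_start, site_len=len(GLI1_SITE)):
    """Bases between an insertion point and the nearest edge of the site (0 if inside)."""
    site_end = site_start + site_len - 1
    if insertion_pos < site_start:
        return site_start - insertion_pos
    if insertion_pos > site_end:
        return insertion_pos - site_end
    return 0


if __name__ == "__main__":
    # Hypothetical toy target: 4 flanking bases, the site, then 10 more bases.
    toy_target = "TTTT" + GLI1_SITE + "ACGTACGTAC"
    site = find_sites(toy_target)[0]
    print("site starts at", site)
    print("insertion at 20 is", distance_to_site(20, site), "bases from the site")
```

Applied to real junction sequences, the same distance measure could be used to tabulate how tightly insertions cluster around the binding site; the coordinate convention (0-based or 1-based, which edge of the site is the reference) would have to match the one used for the locus.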
[0107] A batch of injected embryos was grown to 24 h and expression of the gfp reporter was investigated. Similarly
to previous gene trap experiments, expression of the gfp donor is expected to be present in the notochord and floor
plate cells only in the case of integration on the shh sequences in the intronic regions (Fig. 12). The results of the gfp
expression analysis are shown in Table VII.



Table VII : Expression of the donor derived gfp gene in notochord cells of zebrafish embryos injected with shh-gli
target and the IS30-Gli1DBD mRNA

Treatment                                                 Total # of embryos   # of Gfp+ cells   # of Gfp+ notochord cells
IS30-Gli1DBD mRNA + gfp fg donor + pshh(Gli1) target      104                  161               10
IS30-Gli1DBD mRNA + gfp fg donor + pshh target            102                  152               0
gfp fg donor + pshh(Gli1) target                          178                  262               0

[0108] The above results indicate that the gfp donor was expressed in the notochord only in the experimental group
where all three elements of the transposition system were injected. This result supports the hypothesis that the
IS30-Gli1DBD fusion protein catalyses insertion into the shh intronic regions. This result together with the analysis of
the junction fragments by PCR provides evidence for the ability of IS30-Gli1DBD to catalyse insertion of donor DNA
sequences into the vicinity of a DNA binding site on the donor DNA. Further experiments are required to measure the
frequency of the targeting reactions among the total number of recombinations detectable in the plasmid recombination
system in the fish embryos.

EXAMPLE 7 : GENOMIC INTEGRATION OF DNA SEQUENCES INDUCED BY IS30 TRANSPOSITION 

[0109] Genomic integration of DNA fragments induced by the IS30 transposase will be tested experimentally as an
important element of this invention. It is proposed that the frequency of genomic integration of transgene constructs
can be enhanced by using transgene donor DNA that contains the inverted repeat dimer structure of IS30, (IS30)2, and
subsequent co-injection of the donor DNA together with IS30 transposase mRNA in the early embryos (see schematic
in Fig. 13). Experiments are under way to detect genomic integration of donor DNA in the zebrafish genome, using a
shh gfp expression cassette flanked by the (IS30)2 intermediate structure. The frequency of integration of the expres-
sion cassette will be estimated by the frequency of the transmission of the transgene into the F1 generation by the
founder P0 parents. F1 embryos will be assayed by DNA detection techniques (PCR, Southern blot hybridization) and
by analysis of gfp activity. Comparisons will be made between groups injected with the donor construct and IS30
transposase, those injected with the donor construct alone, and those injected with transposase and donor construct
not containing the (IS30)2 dimer structure.

EXAMPLE 8 : REPLACEMENT OF THE TARGET RECOGNITION DOMAIN OF IS30 TRANSPOSASE 

[0110] The first N-terminal 39 aa of the IS30 transposase were replaced with different parts of the cI repressor of
bacteriophage λ, resulting in a hybrid fusion protein. Besides the wild type IS30 transposase, three fusion proteins were
constructed, all containing the IS30 transposase domain (40-383 aa) and the DBD domain of the cI repressor but
differing in their length :

- ΔDBD-IS30 : IS30 transposase lacking the first 39 aa ;

- HTH-lambdaCI-IS30 : contains only the helix-turn-helix DBD domain of bacteriophage lambda (39 aa) ;

- N-terminal-lambdaCI-IS30 : the first N-terminal 92 aa of the cI repressor were fused to IS30 ;

- Full-lambdaCI-IS30 : the whole cI repressor was fused to the IS30 transposase.

[0111] The plasmids used in further experiments contained both the ORF of the fusion proteins and the (IS30)2
intermediate structure, but did not carry any IS30 hot spot or the operator region of bacteriophage λ. The fusion proteins
were expressed from the promoter present in the (IS30)2 intermediate structure (Dalrymple, 1987). The activity of the
protein was tested in E. coli using a deletion assay. In this experimental setup two kinds of target can be used by the
IS30 transposase, resulting in deletion formation : the IR ends of the IS30 element and heterologous sequences rep-
resenting the new, gained target specificity of the element. An appropriate E. coli strain was transformed with the
plasmids, selecting for the antibiotic resistance marker of the vector. Single colonies were chosen and were grown in
liquid media for 8 generations. Afterwards, plasmid DNA was purified and the structure was analysed with restriction
endonucleases.



Table VIII : Distribution of the rearrangements mediated by fusion proteins

                                                                Target used
Fusion protein               Total tested   No rearrangement    IR sequence    heterologous sequence
wild type IS30               29             20                  6              3
ΔDBD-IS30                    20             20                  0              0
HTH-lambdaCI-IS30            40             1                   22             17
N-terminal-lambdaCI-IS30     40             3                   37             0
Full-lambdaCI-IS30           40             1                   22             17


[0112] λCI-IS30 fusion proteins were found to be more active than the wild type and the truncated IS30 transposase.
Interestingly, the N-terminal-lambdaCI-IS30 fusion protein was only active in the recognition of the IR hot spots, while
both the HTH-lambdaCI-IS30 and Full-lambdaCI-IS30 fusions also recognised heterologous sequences. These results
suggest that the replacement of the first 39 aa of the transposase by a more defined DBD domain can also result in
[...].

[0113] In order to analyse the heterologous sites used in the experiment, the deletion junctions were sequenced
(Fig. 14). The sequence analysis confirmed that the fusion protein HTH-lambdaCI-IS30 recognises hot spot sequences,
while two sites were used twice (bs480 and bs577). It is important to note that the consensus sequence derived from
7 independent experiments shows significant homology to the lambda operator sequence.

[0114] All of these results suggest that the replacement of the target recognition domain of IS30 transposase by a
heterologous DBD domain can provide a new specificity to the fusion protein.
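
The comparison of the fusion proteins in Table VIII can be summarised as two ratios per protein: the fraction of tested plasmids that were rearranged, and the share of those rearrangements that used a heterologous sequence rather than an IR target. The Python sketch below computes both and also checks that each row of the table adds up to its total. The names are illustrative, and "deltaDBD-IS30" is simply an ASCII spelling of the deletion variant.

```python
# Rows transcribed from Table VIII:
# (total tested, no rearrangement, IR sequence used, heterologous sequence used).
TABLE_VIII = {
    "wild type IS30": (29, 20, 6, 3),
    "deltaDBD-IS30": (20, 20, 0, 0),
    "HTH-lambdaCI-IS30": (40, 1, 22, 17),
    "N-terminal-lambdaCI-IS30": (40, 3, 37, 0),
    "Full-lambdaCI-IS30": (40, 1, 22, 17),
}


def summarize_row(total, unchanged, ir, heterologous):
    """Return (fraction rearranged, heterologous share of rearrangements or None)."""
    if unchanged + ir + heterologous != total:
        raise ValueError("row counts do not add up to the total tested")
    rearranged = ir + heterologous
    share = heterologous / rearranged if rearranged else None
    return rearranged / total, share


if __name__ == "__main__":
    for name, row in TABLE_VIII.items():
        fraction, share = summarize_row(*row)
        share_text = "n/a" if share is None else f"{100 * share:.0f} %"
        print(f"{name}: {100 * fraction:.0f} % rearranged, heterologous share {share_text}")
```

All five rows are internally consistent, and the fusion proteins rearrange more than 90 % of the tested plasmids compared with about 31 % for the wild type.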

FIGURES 






Figure 1. 

[0115] Schematic representation of the deletion reaction generated by the IS30 transposase between plasmids in
fish embryos. The inverted triangles represent the inverted repeats of the IS30 mobile element that serve as recognition
sequences for IS30 on donor constructs. Abbreviations: ApR, ampicillin resistance; KmR, kanamycin resistance; CmR,
chloramphenicol resistance; IS30, IS30 transposase ORF.

Figure 2. 

[0116] Detection of the mosaically distributed IS30 transposase protein fused to the myc epitope tag by immunohis-
tochemical staining in zebrafish embryos injected with a CMV; IS30 DNA construct. Arrowheads point to cells of the
gastrulating zebrafish embryo that contain the fusion protein in their nuclei (upper panel) and in the cytoplasm (lower
panel).






Figure 3. 

[0117] Schematic representation of the experimental design used to detect transpositional insertion in target plasmids
in zebrafish embryos. Abbreviations and symbols are as in Fig. 1 or: SP6, SP6 promoter; intA, carp β-actin gene first
intron with splice acceptor; GOHS, GOHS hot spot of IS30 transposition; KmR, kanamycin resistance gene. Numbers
in target and transpositional fusion construct demonstrate the location of the shh exons on the shh locus, which is
represented by a thick black line.

Figure 4.

[0118] Detection of gfp expression in one-day-old zebrafish embryos that contain the shh-gfp hybrid construct as
described in Fig. 3. The hybrid construct contains the green fluorescent protein reporter gene under the control of the
sonic hedgehog transcriptional regulatory elements, resulting in expression of gfp in the notochord cells (A, B). Ectopic
activity in cells where shh is not expressed normally (C, epidermis; D, muscle fibres) has also been detected.



Figure 5.

[0119] Insertion of a gfp donor fragment in the shh target locus induced by IS30 transposase results in tissue specific
expression of the gfp reporter gene in zebrafish embryos. A, Tail region of a control non-injected embryo. B, Embryo
injected with the circular GFP-fg donor and GOHS-shh target showing non-specific activity in epidermal cells (arrow-
heads). C-D, Embryos injected with GFP-fg donor and GOHS-shh target together with IS30 transposase express gfp
in the notochord cells (arrows), suggesting regulation of the gfp gene by the shh locus. Abbreviations: n, notochord;
ye, yolk extension. Embryos are oriented anterior left except for B, where anterior is to the right. Lateral views of the
tail region.

Figure 6. 

[0120] Detection of junction fragments between the gfp donor and shh target sequences by PCR.
[0121] A. Schematic representation of the expected junction fragments amplified by PCR. Arrows below the sche-
matic representation of the DNA indicate oligonucleotide primers used in the PCR reactions. Expected sizes of PCR
products are indicated at the 5' and 3' junctions of recombined DNA fragments. B. PCR reactions demonstrating the
presence of junction fragments in fish embryos between micro-injected circular gfp-fg donor and the GOHS-shh target
plasmids.
[0122] C. Control PCR reaction to detect the presence of the original donor and target molecules in DNA purified
from micro-injected fish embryos. D. PCR reaction to detect the junction fragments produced by recombination of the
circular gfp-fg donor and GOHS-shh target in different orientations. Abbreviations: GFP-fg: circular gfp-fg donor, GFP-
p: gfp plasmid donor, GOHS-shh: GOHS-shh target plasmid, EI: EcoRI digested PCR product, m: molecular weight
marker, pc: shh-gfp hybrid plasmid control, 5': junction fragment generated at the 5' end of inserted donor, 3': junction
fragment generated at the 3' end of inserted donor.

Figure 7. 

[0123] A. Schematic representation of the transposition system based on transposase fusion proteins. B. Results
of transposition experiments using different target sequences and fusion variants of IS30. C. The insertion sites of the
transposase producer donor in the target plasmid. Numbers and triangles represent sites and number of integration
events detected in these particular sites. Abbreviations as before, and: tac, tac promoter; TcR, tetracycline resistance
gene.

Figure 8.

[0124] Analysis of the insertions mediated by the IS30 transposase-CI lambda repressor fusion protein. On the right
panel the position of integration is presented according to the pEMBL19 sequence. The site of target duplication in
the sequences (2 bp in the middle) is separated from the flanking regions by a space. Bold letters in the consensus
represent matches to the CIG consensus sequence for IS30 target sites (Olasz et al., 1998). W, R and Y in the consensus
represent A or T, purine and pyrimidine bases, respectively.

Figure 9. 

[0125] Schematic representation of the transpositional integration system used to test the efficiency of a deletion
variant of IS30. Abbreviations as before.

Figure 10. 

[0126] Schematic representation of the transposition system used to detect targeted transposition between plasmids
in fish embryos. Abbreviations and symbols as in Fig. 4 with the following exceptions : yellow bar in shh-(Gli1) target
and shh-gfp hybrid fusion product represents the insertion of a GACCACCCA sequence starting in the position 3976
of the shh genomic locus in intron I. Arrows represent annealing positions of S3 and IRL oligonucleotides used in PCR
reactions to detect junction fragments. Red bar represents aa 772 to 1106 of the mouse Gli1 zinc finger domains fused
to the IS30 transposase.



Figure 11. 

[0127] Analysis of the fusion fragments detected by PCR between shh and IS end of gfp donor sequences. Red bars
with triangles represent IS end sequences, green bars represent shh-gli donor sequences. Yellow bars represent the
Gli1 DNA binding site (GACCACCCA sequence). Numbers above the junction fragments demonstrate the position of
integration of IS end containing donor sequences to the shh-gli target DNA. Black bars demonstrate the positions
where S3 and IRL oligonucleotides anneal to target sequences in the PCR reactions.



Figure 12. 

[0128] Expression of gfp was detected in embryos injected with all three components of the IS30 transposition sys-
tem. Experimental conditions and abbreviations are as described in Fig. 4. Details of gfp expression analysis are
described in Table VII.

Figure 13. 

[0129] Schematic representation of the transpositional integration system used in zebrafish to enhance transgene 
integration efficiency. Abbreviations: shh, shh promoter and cis regulatory elements. GFP, gfp gene. 

Figure 14. 

[0130] Analysis of the deletions mediated by the IS30 transposase-CI lambda repressor fusion protein. The site of
target duplication in the sequences (2 bp in the middle - TD) is separated from the flanking regions by a space. Boxes
in the consensus represent matches to the lambda operator. W and M in the consensus represent A or T, A or C bases,
respectively. 
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SEQUENCE LISTING 

<110> Association pour le developpement de la recherche en genetique moleculaire
- ADEREGEM

<120> Site-directed recombinase fusion proteins and corresponding
polynucleotides, vectors and kits, and their uses for site-directed DNA
recombination

<130> D19376

<160> 6

<170> PatentIn version 3.0

<210> 1

<211> 1937 

<212> DNA 

<213> Artificial sequence 






<220> 

<223> Is30-ci repressor fusion protein 






<220> 

<221> CDS 

<222> (63)..(1934)

<400> 1

tgtagattca attggtcaac gcaacagtta tgtgaaaaca tggggttgcg gaggtttttt 60 

ga atg aga cga aca ttt aca gca gag gaa aaa gcc tct gtt ttt gaa 107 
Met Arg Arg Thr Phe Thr Ala Glu Glu Lys Ala 5er Val Phe Glu 



1 



5 10 15 



eta tgg aag aac gga aca ggc ttc agt gaa ata gcg aat ate ctg ggt 155 

Leu Trp Lys Asn Gly Thr Gly Phe Ser Glu He Ala Asn He Leu Gly 

20 * 25 30 

tea aaa ccc gga acg ate ttc act atg tta agg gat act ggc ggc ata 203 

Ser Lys Pro Gly Thr He Phe Thr Met Leu Arg Asp Thr Gly Gly He 

35 4 0 45 

aaa ccc cat gag cgt aag egg get gta get cac ctg aca ctg tct gag 

Lys Pro His Glu Arg Lys Arg Ala Val Ala His Leu Thr Leu Ser Glu 

50 55 60 

cgc gag gag ata cga get ggt ttg tea gcc aaa atg age att cgt gcg 299 

Arg Glu Glu He Arg Ala Gly Leu Ser Ala Lys Met Ser He Arg Ala 

65 70 75 



ata get act gcg ctg aat cgc agt cct teg acg ate tea cgt gaa gtt 
He Ala Thr Ala Leu Asn Arg Ser Pro Ser Thr He Ser Arg Glu Val 
60 85 90 95 



251 



347 



cag cgt aat egg ggc aga cgc tat tac aaa get gtt gat get aat aac 395 
Gin Arg Asn Arg Gly Arg Arg Tyr Tyr Lys Ala Val Asp Ala Asn Asn 
100 105 110 

cga gcc aac aga atg gcg aaa agg cca aaa ccg tgc tta ctg gat caa 443 
Arg Ala Asn Arg Met Ala Lys Arg Pro Lys Pro Cys Leu Leu Asp Gin 



115 120 125 

aat tta cca ttg cga aag ctt gtt ctg gaa aag ctg gag atg aaa tgg 
Asn Leu Pro Leu Arg Lys Leu Val Leu Glu Lys Leu Glu Met Lys Trp 



130 



135 1^0 



tct cca gag caa ata tea gga tgg tta agg cga aca aaa cca cgt caa 
Ser Pro Glu Gin lie Ser Gly Trp Leu Arg Arg Thr Lys Pro Arg Gin 



160 



165 170 175 



cgt age cgt gaa gcg eta cac cac ctg aat ata cag cat ctg cga egg 
Arq Ser Arg Glu Ala Leu His His Leu Asn He Gin His Leu Arg Arg 
180 185 190 

teg cat age ctt cgc cat ggc agg cgt cat acc cgc aaa ggc gaa aga 
Ser His Ser Leu Arg His Gly Arg Arg His Thr Arg Lys Gly Glu Arg 
195 200 205 

qgt acg att aac ata gtg aac gga aca cca att cac gaa cgt tec cga 
Gly Thr lie Asn He Val Asn Gly Thr Pro He His Glu Arg Ser Arg 
210 215 220 

aat ate gat aac aga cgc tct eta ggg cat tgg gag ggc gat tta gtc 
Asn lie Asp Asn Arg Arg Ser Leu Gly His Trp Glu Gly Asp Leu Val 
225 230 235 

tea ggt aca aaa aac tct cat ata gec aca ctt gta gac cga aaa tea 
Ser Gly Thr Lys Asn Ser His He Ala Thr Leu Val Asp Arg Lys Ser 
30 2 40 * 245 250 255 

cgt tat acg ate ate ctt aga etc agg ggc aaa gat tct gtc tea gta 
Arg Tyr Thr He He Leu Arg Leu Arg Gly Lys Asp Ser Val Ser Val 
260 265 270 



325 330 335 



tac ttt cct aaa aag aca tgt ctt gee caa tat act caa cat gaa eta 
Tyr Phe Pro Lys Lys Thr Cys Leu Ala Gin Tyr Thr Gin His Glu Leu 

345 350 



34 0 



gat ctg gtt get get cag eta aac aac aga ccg aga aag aca ctg aag 
Asp Leu Val Ala Ala Gin Leu Asn Asn Arg Pro Arg Lys Thr Leu Lys 
355 360 365 



491 



539 



145 ISO 155 

10 aaa acg ctg cga ata tea cct gag aca att tat aaa acg ctg tac ttt 587 

Lys Thr Leu Arg He Ser Pro Glu Thr He Tyr Lys Thr Leu Tyr Phe 



635 



683 



731 



779 



827 



875 



923 



35 aat cag get ctt acc gac aaa ttc ctg agt tta ccg tea gaa etc aga 

Asn Gin Ala Leu Thr Asp Lys Phe Leu Ser Leu Pro Ser Glu Leu Arg 
275 280 285 

aaa tea ctg aca tgg gac aga gga atg gaa ctg gec aga cat eta gaa 

Lys Ser Leu Thr Trp Asp Arg Gly Met Glu Leu Ala Arg His Leu Glu 
40 2 9 0 " 295 300 

ttt act gtc age acc ggc gtt aaa gtt tac ttc tgc gat cct cag agt 

Phe Thr Val Ser Thr Gly Val Lys Val Tyr Phe Cys Asp Pro Gin Ser 

305 310 . 315 

45 cct tgg cag egg gga aca aat gag aac aca aat ggg eta att egg cag 1067 

Pro Trp Gin Arg Gly Thr Asn Glu Asn Thr Asn Gly Leu He Arg Gin 
32Q 



971 



1019 



1115 



1163 






ttc aaa aca ccg aaa gag ata att gaa agg ggt gtt gca ttg aca gat 1211 
Phe Lys Thr Pro Lys Glu He He Glu Arg Gly Val Ala Leu Thr Asp 
370 ' 375 380 

ctg cag atg age aca aaa aag aaa cca tta aca caa gag cag ctt gag 1259 
Leu Gin Met Ser Thr Lys Lys Lys Pro Leu Thr Gin Glu Gin Leu Glu 
385 390 395 

gac gca cgt cgc ctt aaa gca att tat gaa aaa aag aaa aat gaa ctt 1307 
Asp Ala Arg Arg Leu Lys Ala He Tyr Glu Lys Lys Lys Asn Glu Leu 
400 405 410 415 

ggc tta tec cag gaa tct gtc gca gac aag atg ggg atg ggg cag tea 1355 
Gly Leu Ser Gin Glu Ser Val Ala Asp Lys Met Gly Met Gly Gin Ser 
420 425 430 

ggc gtt ggt get tta ttt aat ggc ate aat gca tta aat get tat aac 1403 
Gly Val Gly Ala Leu Phe Asn Gly He Asn Ala Leu Asn Ala Tyr Asn 
435 440 445 



gec gca ttg ctt gca aaa att etc aaa gtt age gtt gaa gaa ttt age 
Ala Ala Leu Leu Ala Lys lie Leu Lys Val Ser Val Glu Glu Phe Ser 
450 455 460 



cag ccg tea ctt aga agt gag tat gag tac cct gtt ttt tct cat gtt 
Gin Pro Ser Leu Arg Ser Glu Tyr Glu Tyr Pro Val Phe Ser His Val 
485 490 495 



480 



1451 



cct tea ate gee aga gaa ate tac gag atg tat gaa gcg gtt agt atg 1499 
Pro Ser He Ala Arg Glu He Tyr Glu Met Tyr Glu Ala Val Ser Met 
465 470 475 



1547 



1595 



1643 



cag gca ggg atg ttc tea cct gag ctt aga acc ttt acc aaa ggt gat 
Gin Ala Gly Met Phe Ser Pro Glu Leu Arg Thr Phe Thr Lys Gly Asp 
500 505 510 

gcg gag aga tgg gta age aca acc aaa aaa gec agt gat tct gca ttc 
Ala Glu Arg Trp Val Ser Thr Thr Lys Lys Ala Ser Asp Ser Ala Phe 
515 520 525 

tgg ctt gag gtt gaa gqt aat tec atg acc gca cca aca ggc tec aag 1691 
Trp Leu Glu Val Glu Gly Asn Ser Met Thr Ala Pro Thr Gly Ser Lys 
530 535 540 

cca age ttt cct gac gga atg tta att etc gtt gac cct gag cag get 1739 
Pro Ser Phe Pro Asp Gly Met Leu He Leu Val Asp Pro Glu Gin Ala 
545 550 555 

gtt gag cca ggt gat ttc tgc ata gec aga ctt ggg ggt gat gag ttt 1787 
Val Glu Pro Gly A3p Phe Cys He Ala Arg Leu Gly Gly Asp Glu Phe 
560 J 565 570 575 

acc ttc aag aaa ctg ate agg gat age ggt cag gtg ttt tta caa cca 
Thr Phe Lys Lys Leu lie Arg Asp Ser Gly Gin Val Phe Leu Gin Pro 
5B0 585 590 

eta aac cca cag tac cca atg ate cca tgc aat gag agt tgt tec gtt 
Leu Asn Pro Gin Tyr Pro Met Tie Pro Cys Asn Glu Ser Cys Ser Val 
595 600 605 



1835 



1883 



gtg ggg aaa gtt atc gct agt cag tgg cct gaa gag acg ttt ggc tga 1931
Val Gly Lys Val Ile Ala Ser Gln Trp Pro Glu Glu Thr Phe Gly
610 615 620






tct aga 
Ser 



<210> 2 

<211> 622 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Is30-ci repressor fusion protein 

<400> 2 

Met Arg Arg Thr Phe Thr Ala Glu Glu Lys Ala Ser Val Phe Glu Leu
1 5 10 15

Trp Lys Asn Gly Thr Gly Phe Ser Glu Ile Ala Asn Ile Leu Gly Ser
20 25 30

Lys Pro Gly Thr Ile Phe Thr Met Leu Arg Asp Thr Gly Gly Ile Lys
35 40 45

Pro His Glu Arg Lys Arg Ala Val Ala His Leu Thr Leu Ser Glu Arg
50 55 60

Glu Glu Ile Arg Ala Gly Leu Ser Ala Lys Met Ser Ile Arg Ala Ile
65 70 75 80

Ala Thr Ala Leu Asn Arg Ser Pro Ser Thr Ile Ser Arg Glu Val Gln
85 90 95

Arg Asn Arg Gly Arg Arg Tyr Tyr Lys Ala Val Asp Ala Asn Asn Arg
100 105 110

Ala Asn Arg Met Ala Lys Arg Pro Lys Pro Cys Leu Leu Asp Gln Asn
115 120 125

Leu Pro Leu Arg Lys Leu Val Leu Glu Lys Leu Glu Met Lys Trp Ser
130 135 140

Pro Glu Gln Ile Ser Gly Trp Leu Arg Arg Thr Lys Pro Arg Gln Lys
145 150 155 160

Thr Leu Arg Ile Ser Pro Glu Thr Ile Tyr Lys Thr Leu Tyr Phe Arg
165 170 175



1937 
Ser Arg Glu Ala Leu His His Leu Asn Ile Gln His Leu Arg Arg Ser
180 185 190

His Ser Leu Arg His Gly Arg Arg His Thr Arg Lys Gly Glu Arg Gly
195 200 205

Thr Ile Asn Ile Val Asn Gly Thr Pro Ile His Glu Arg Ser Arg Asn
210 215 220

Ile Asp Asn Arg Arg Ser Leu Gly His Trp Glu Gly Asp Leu Val Ser
225 230 235 240

Gly Thr Lys Asn Ser His Ile Ala Thr Leu Val Asp Arg Lys Ser Arg
245 250 255

Tyr Thr Ile Ile Leu Arg Leu Arg Gly Lys Asp Ser Val Ser Val Asn
260 265 270

Gln Ala Leu Thr Asp Lys Phe Leu Ser Leu Pro Ser Glu Leu Arg Lys
275 280 285

Ser Leu Thr Trp Asp Arg Gly Met Glu Leu Ala Arg His Leu Glu Phe
290 295 300

Thr Val Ser Thr Gly Val Lys Val Tyr Phe Cys Asp Pro Gln Ser Pro
305 310 315 320

Trp Gln Arg Gly Thr Asn Glu Asn Thr Asn Gly Leu Ile Arg Gln Tyr
325 330 335

Phe Pro Lys Lys Thr Cys Leu Ala Gln Tyr Thr Gln His Glu Leu Asp
340 345 350

Leu Val Ala Ala Gln Leu Asn Asn Arg Pro Arg Lys Thr Leu Lys Phe
355 360 365

Lys Thr Pro Lys Glu Ile Ile Glu Arg Gly Val Ala Leu Thr Asp Leu
370 375 380

Gln Met Ser Thr Lys Lys Lys Pro Leu Thr Gln Glu Gln Leu Glu Asp
385 390 395 400

Ala Arg Arg Leu Lys Ala Ile Tyr Glu Lys Lys Lys Asn Glu Leu Gly
405 410 415

Leu Ser Gln Glu Ser Val Ala Asp Lys Met Gly Met Gly Gln Ser Gly
420 425 430

Val Gly Ala Leu Phe Asn Gly Ile Asn Ala Leu Asn Ala Tyr Asn Ala
435 440 445

Ala Leu Leu Ala Lys Ile Leu Lys Val Ser Val Glu Glu Phe Ser Pro
450 455 460

Ser Ile Ala Arg Glu Ile Tyr Glu Met Tyr Glu Ala Val Ser Met Gln
465 470 475 480

Pro Ser Leu Arg Ser Glu Tyr Glu Tyr Pro Val Phe Ser His Val Gln
485 490 495

Ala Gly Met Phe Ser Pro Glu Leu Arg Thr Phe Thr Lys Gly Asp Ala
500 505 510

Glu Arg Trp Val Ser Thr Thr Lys Lys Ala Ser Asp Ser Ala Phe Trp
515 520 525

Leu Glu Val Glu Gly Asn Ser Met Thr Ala Pro Thr Gly Ser Lys Pro
530 535 540

Ser Phe Pro Asp Gly Met Leu Ile Leu Val Asp Pro Glu Gln Ala Val
545 550 555 560

Glu Pro Gly Asp Phe Cys Ile Ala Arg Leu Gly Gly Asp Glu Phe Thr
565 570 575

Phe Lys Lys Leu Ile Arg Asp Ser Gly Gln Val Phe Leu Gln Pro Leu
580 585 590

Asn Pro Gln Tyr Pro Met Ile Pro Cys Asn Glu Ser Cys Ser Val Val
595 600 605

Gly Lys Val Ile Ala Ser Gln Trp Pro Glu Glu Thr Phe Gly
610 615 620



<210> 3 

<211> 1650 

<212> DNA

<213> Artificial sequence 
<220> 

<221> CDS 

<222> (1)..(1650)



50 ' 55 60 



85 



90 95 



cgt aat egg ggc aga cgc tat tac aaa get gtt gat get aat aac cga 
Arg Asa Arg Gly Arg Arg Tyr Tyr Lys Ala Val Asp Ala Asn Asn Arg 
100 105 HO 



gec aac aga atg gcg aaa agg cca aaa ccg tgc tta ctg gat caa aat 
Ala Asn Arg Met Ala Lys Arg Pro Lys Pro Cys Leu Leu Asp Gin Asn 



1X5 



120 125 



35 130 



tta cca ttg cga aag ctt gtt ctg gaa aag ctg gag atg aaa tgg tct 
Leu Pro Leu Arg Lys Leu Val Leu Glu Lys Leu Glu Met Lys Trp Ser 

135 140 



cca gag caa ata tea gga tgg tta agg cga aca aaa cca cgt caa aaa 
Pro Glu Gin He Ser Gly Trp Leu Arg Arg Thr Lys Pro . Arg Gin Lys 
150 155. 160 



145 



48 



96 



<223> IS30-gli

<400> 3

atg aga cga aca ttt aca gca gag gaa aaa gcc tct gtt ttt gaa cta

Met Arg Arg Thr Phe Thr Ala Glu Glu Lys Ala Ser Val Phe Glu Leu 

1 J 5 10 15 

tgg aag aac gga aca ggc ttc agt gaa ata gcg aat ate ctg ggt tea 

Trp Lys Asn Gly Thr Gly Phe Ser Glu lie Ala Asn He Leu Gly Ser 
10 20 25 30 

aaa ccc gga acg ate ttc act atg tta agg gat act ggc ggc ata aaa 14 4 

Lys Pro Gly Thr He Phe Thr Met Leu Arg Asp Thr Gly Gly He Lys 
35 40 45 

15 ccc cat gag cgt aag egg get gta get cac ctg aca ctg tct gag cgc 192 

Pro His Glu Arg Lys Arg Ala Val Ala His Leu Thr Leu Ser Glu Arg 



gag gag ata cga get ggt ttg tea gee aaa atg age att cgt gcg ata 240 

Glu Glu He Arg Ala Gly Leu Ser Ala Lys Met Ser He Arg Ala He 
65 70 75 80 

get act gcg ctg aat cgc agt cct teg acg ate tea cgt gaa gtt cag 288 

Ala Thr Ala Leu Asn Arg Ser Pro Ser Thr He Ser Arg Glu Val Gin 



336 



384 



432 



480 



acg ctg cga ata tea cct gag aca att tat aaa acg ctg tac ttt cgt 528 

Thr Leu Arg He Ser Pro Glu Thr He Tyr Lys Thr Leu Tyr Phe Arg 

165 170 175 

age cgt gaa gcg eta cac cac ctg aat ata cag cat ctg cga egg teg 576 

Ser Arg Glu Ala Leu His His Leu Asn He Gin His Leu Arg Arg Ser 

45 ' 18Q 185 190 

cat age ctt cgc cat ggc agg cgt cat acc cgc aaa ggc gaa aga ggt 624 

His Ser Leu Arg His Gly Arg Arg His Thr Arg Lys Gly Glu Arg Gly 

195 200 205 

50 acg att aac ata gtg aac gga aca cca att cac gaa cgt tec cga aat 672 

Thr He Asn He Val Asn Gly Thr Pro He His Glu Arg Ser Arg Asn 

210 215 220 

ate gat aac aga cgc tct eta ggg cat tgg gag ggc gat tta gtc tea 720 




He Asp Asn Arg Arg Ser Leu Gly His Trp Glu Gly Asp Leu Val Ser 
225 230 235 240 

got aca aaa aac tct cat ata gcc aca ctt gta gac cga aaa tea cgt 
Glv Thr Lys Asn Ser His He Ala Thr Leu Val Asp Arg Lys Ser Arg 
245 250 255 

tat acg ate ate ctt aga etc agg ggc aaa gat tct gtc tea gta aat 
Tvr Thr lie He Leu Arg Leu Arg Gly Lys Asp Ser Val Ser Val Asn 



260 265 270 

cag get ctt ace gac aaa ttc ctg agt tta ccg tea gaa etc aga aaa 864 

Gin Ala Leu Thr Asp Lys Phe Leu Ser Leu Pro Ser Glu Leu Arg Lys 

275 280 285 



305 



310 ^ 315 320 



385 



390 ^ 395 400 



gaa ttt gac tec caa gag cag ctg gtg cac cac ate aac age gag cac 
40 Glu Phe Asp Ser Gin Glu Gin Leu Val His His Tie Asn Ser Glu His 

405 410 415 



76B 



816 



tea ctg aca tgg gac aga gga atg gaa ctg gcc aga cat eta gaa ttt 912 

Ser Leu Thr Trp Asp Arg Gly Met Glu Leu Ala Arg His Leu Glu Phe 
290 295 300 

act gtc age acc ggc gtt aaa gtt tac ttc tgc gat cct cag agt cct 

Thr Val Ser Thr Gly Val. Lys Val Tyr Phe Cys Asp Pro Gin Ser Pro 



960 



1008 



1056 



1104 



tgg cag egg gga aca aat gag aac aca aat ggg eta att egg cag tac 

Trp Gin Arg Gly Thr Asn Glu Asn Thr Asn Gly Leu He Arg Gin Tyr 
325 330 335 

25 t tt cct aaa aag aca tgt ctt gcc caa tat act caa cat gaa eta gat 

Phe Pro Lys Lys Thr Cys Leu Ala Gin Tyr Thr Gin His Glu Leu Asp 

340 345 350 

ctg gtt get get cag eta aac aac aga ccg aga aag aca ctg aag ttc 

Leu Val Ala Ala Gin Leu Asn Asn Arq Pro Arg Lys Thr Leu Lys Phe 

30 355 360 365 

aaa aca ccg aaa gag ata att gaa agg ggt gtt gca ttg aca gat gaa 1152 

Lys Thr Pro Lys Glu He He Glu Arg Gly Val Ala Leu Thr Asp Glu 

370 375 380 

35 ttc gaa tct atg tat gaa act gac tgc cgt tgg gat ggc tgc age cag 1200 

Phe Glu Ser Met Tyr Glu Thr Asp Cys Arg Trp Asp Gly Cys Ser Gin 

nnc A A n 



1248 



ate cac ggg gag egg aag gag ttc gtg tgc cac tgg ggg ggc tgc tec 1296 

He His Gly Glu Arg Lys Glu Phe Val Cys His Trp Gly Gly Cys Ser 

420 425 430 

agg gag ctg agg ccc ttc aaa gee cag tac atg ctg gtg gtt cac atg 1344 

Arg Glu Leu Arg Pro Phe Lys Ala Gin Tyr Met Leu Val Val His Met 

435 440 445 

cgc aga cac act ggc gag aag cca cac aag tgc acg ttt gaa ggg tgc 

Arg Arg His Thr Gly Glu Lys Pro His Lys Cys Thr Phe Glu Gly Cys 

450 455 460 

egg aag tea tac tea cgc etc gaa aac ctg aag acg cac ctg egg tea 

Axg Lys Ser Tyr Ser Arg Leu Glu Asn Leu Lys Thr His Leu Arg Ser 



1392 



1440 
465 470 475 480

ra( . aCQ a at qag aag cca tac atg tgt gag cac gag ggc tgc agt aaa 

His Thr Gly Glu Lys Pro Tyr Met Cys Glu His Glu Gly Cys Ser Lys

485 490 495

gcc ttc agc aat gcc agt gac cga gcc aag cac cag aat cgg acc cat
Ala Phe Ser Asn Ala Ser Asp Arg Ala Lys His Gln Asn Arg Thr His
500 505 510

tcc aat gag aag ccg tat gta tgt aag ctc cct ggc tgc acc aaa cgc
Ser Asn Glu Lys Pro Tyr Val Cys Lys Leu Pro Gly Cys Thr Lys Arg
515 520 525

tat aca gat cct agc tcg ctg cga aaa cat gtc aag aca gtg cat ggt
Tyr Thr Asp Pro Ser Ser Leu Arg Lys His Val Lys Thr Val His Gly
530 535 540

cct gac gcc act agt taa
Pro Asp Ala Thr Ser
545






<210> 4 

<211> 549 

<212> PRT 

<213> Artificial sequence 

<400> 4 

Met Arg Arg Thr Phe Thr Ala Glu Glu Lys Ala Ser Val Phe Glu Leu
1 5 10 15

Trp Lys Asn Gly Thr Gly Phe Ser Glu Ile Ala Asn Ile Leu Gly Ser
20 25 30

Lys Pro Gly Thr Ile Phe Thr Met Leu Arg Asp Thr Gly Gly Ile Lys
35 40 45

Pro His Glu Arg Lys Arg Ala Val Ala His Leu Thr Leu Ser Glu Arg
50 55 60

Glu Glu Ile Arg Ala Gly Leu Ser Ala Lys Met Ser Ile Arg Ala Ile
65 70 75 80

Ala Thr Ala Leu Asn Arg Ser Pro Ser Thr Ile Ser Arg Glu Val Gln
85 90 95

Arg Asn Arg Gly Arg Arg Tyr Tyr Lys Ala Val Asp Ala Asn Asn Arg
100 105 110

Ala Asn Arg Met Ala Lys Arg Pro Lys Pro Cys Leu Leu Asp Gln Asn
115 120 125



1488 



1536 



1584 



1632 



1650 
Leu Pro Leu Arg Lys Leu Val Leu Glu Lys Leu Glu Met Lys Trp Ser
130 135 140

Pro Glu Gln Ile Ser Gly Trp Leu Arg Arg Thr Lys Pro Arg Gln Lys
145 150 155 160

Thr Leu Arg Ile Ser Pro Glu Thr Ile Tyr Lys Thr Leu Tyr Phe Arg
165 170 175

Ser Arg Glu Ala Leu His His Leu Asn Ile Gln His Leu Arg Arg Ser
180 185 190

His Ser Leu Arg His Gly Arg Arg His Thr Arg Lys Gly Glu Arg Gly
195 200 205

Thr Ile Asn Ile Val Asn Gly Thr Pro Ile His Glu Arg Ser Arg Asn
210 215 220

Ile Asp Asn Arg Arg Ser Leu Gly His Trp Glu Gly Asp Leu Val Ser
225 230 235 240

Gly Thr Lys Asn Ser His Ile Ala Thr Leu Val Asp Arg Lys Ser Arg
245 250 255

Tyr Thr Ile Ile Leu Arg Leu Arg Gly Lys Asp Ser Val Ser Val Asn
260 265 270

Gln Ala Leu Thr Asp Lys Phe Leu Ser Leu Pro Ser Glu Leu Arg Lys
275 280 285

Ser Leu Thr Trp Asp Arg Gly Met Glu Leu Ala Arg His Leu Glu Phe
290 295 300

Thr Val Ser Thr Gly Val Lys Val Tyr Phe Cys Asp Pro Gln Ser Pro
305 310 315 320

Trp Gln Arg Gly Thr Asn Glu Asn Thr Asn Gly Leu Ile Arg Gln Tyr
325 330 335

Phe Pro Lys Lys Thr Cys Leu Ala Gln Tyr Thr Gln His Glu Leu Asp
340 345 350

Leu Val Ala Ala Gln Leu Asn Asn Arg Pro Arg Lys Thr Leu Lys Phe
355 360 365

Lys Thr Pro Lys Glu Ile Ile Glu Arg Gly Val Ala Leu Thr Asp Glu
370 375 380

Phe Glu Ser Met Tyr Glu Thr Asp Cys Arg Trp Asp Gly Cys Ser Gln
385 390 395 400

Glu Phe Asp Ser Gln Glu Gln Leu Val His His Ile Asn Ser Glu His
405 410 415

Ile His Gly Glu Arg Lys Glu Phe Val Cys His Trp Gly Gly Cys Ser
420 425 430

Arg Glu Leu Arg Pro Phe Lys Ala Gln Tyr Met Leu Val Val His Met
435 440 445

Arg Arg His Thr Gly Glu Lys Pro His Lys Cys Thr Phe Glu Gly Cys
450 455 460

Arg Lys Ser Tyr Ser Arg Leu Glu Asn Leu Lys Thr His Leu Arg Ser
465 470 475 480

His Thr Gly Glu Lys Pro Tyr Met Cys Glu His Glu Gly Cys Ser Lys
485 490 495

Ala Phe Ser Asn Ala Ser Asp Arg Ala Lys His Gln Asn Arg Thr His
500 505 510

Ser Asn Glu Lys Pro Tyr Val Cys Lys Leu Pro Gly Cys Thr Lys Arg
515 520 525

Tyr Thr Asp Pro Ser Ser Leu Arg Lys His Val Lys Thr Val His Gly
530 535 540

Pro Asp Ala Thr Ser
545



<210> 5 

<211> 1785 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> IS-tet fusion protein 



<220> 

<221> CDS 

<222> (1)..(1785)



ccc cat gag cgt aag egg get gta get cac ctg aca ctg tct gag cgc 

Pro His Glu Arg T.ys Arg Ala Val Ala His Leu Thr Leu Ser Glu Arg 
50 " 55 60 

gag gag ata cga get ggt ttg tea gee aaa atg age att cgt gcg ata 

Glu Glu lie Arg Ala Gly Leu Ser Ala Lys Met Ser lie Arg Ala lie 

65 70 75 80 

get act gcg ctg aat cgc agt cct teg acg ate tea cgt gaa gtt cag 

Ala Thr Ala Leu Asn Arg Ser Pro Ser Thr He Ser Arg Glu Val Gin 
85 90 95 



gec aac aga atg gcg aaa agg cca aaa ccg tgc tta ctg gat caa aat 
Ala Asn Arg Met Ala Lys Arg Pro Lys Pro Cys Leu Leu Asp Gin Asn 
115 ~ 120 125 



48 



96 



<400> 5 

atg aga cga aca ttt aca gca gag gaa aaa gec tct grt ttt gaa eta 

Met Arg Arg Thr Phe Thr Ala Glu Glu Lys Ala Ser Val Phe Glu Leu 

1 5 .10 15 

tgg aag aac gga aca ggc ttc agt gaa ata gcg aat ate ctg ggt tea 

Trp Lys Asn Gly Thr Gly Phe Ser Glu He Ala Asn He Leu Gly Ser 

20 " 25 30 

aaa ccc gga acg ate ttc act atg tta agg gat act ggc ggc ata aaa 144 

Lys Pro Gly Thr He Phe Thr Met Leu Arg Asp Thr Gly Gly lie Lys 

35 40 45 



192 



240 



28e 



cgt aat egg ggc aga cgc tat tac aaa get gtt gat get aat aac cga 336 
Arg Asn Arg Gly Arg Arg Tyr Tyr Lys Ala Val Asp Ala Asn Asn Arg 
25 100 105 110 



384 



30 tta cca ttg cga aag ctt gtt ctg gaa aag ctg gag atg aaa tgg tct 432 

Leu Pro Leu Arg Lys Leu Val Leu Glu Lys Leu Glu Met Lys Trp Ser 

130 135 140 

cca gag caa ata tea gga tgg tta agg cga aca aaa cca cgt caa aaa 4 80 

35 Pro Glu Gin He Ser Gly Trp Leu Arg Arg Thr Lys Pro Arg Gin Lys 

145 150 155 160 

acg ctg cga ata tea cct gag aca att tat aaa acg ctg tac ttt cgt 528 

Thr Leu Arg He Ser Pro Glu Thr He Tyr Lys Thr Leu Tyr Phe Arq 
165 170 175 

40 

age cgt gaa gcg eta cac cac ctg aat ata cag cat ctg cga egg teg 576 

Ser Arg Glu Ala Leu His His Leu Asn He Gin His Leu Arg Arg Ser 
180 185 190 

cat age ctt cgc cat ggc agg cgt cat acc cgc aaa ggc gaa aga ggt 624 

45 His Ser Leu Arg His Gly Arg Arg His Thr Arg Lys Gly Glu Arg Gly 
195 200 205 

acg att aac ata gtg aac gga aca cca att cac gaa cgt tec cga aat 672 

Thr He Asn He Val Asn Gly Thr Pro He His G^u Arg Ser Arg Asn 

210 215 220 



ate gat aac aga cgc tct eta ggg cat tgg gag ggc gat tta gtc tea 720 
He Asp Asn Arg Arg Ser Leu Gly His Trp Glu Gly Asp Leu Val Ser 
225 230 235 240 






ggt aca aaa aac tct cat ata gcc aca ctt gta gac cga aaa tea cgt 
Gly Thr Lys Asn Ser His He Ala Thr Leu Val Asp Arg Lys Ser Arg 
245 250 255 



768 



tat acg ate ate ctt aga etc agg ggc aaa gat tct gtc tea gta aat 816 
Tyc Thr He He Leu Arg Leu Arg Gly Lys Asp Ser Val Ser Val Asn 
260 265 270 



ctg ctt aat gag gtc gga. ate gaa ggt tta aca acc cgt aaa etc gcc 
Leu Leu Asn Glu Val Gly He Glu Gly Leu Thr Thr Arg Lys Leu Ala 
405 410 415 



gga gca aaa gta eat tea gat aca egg cct aca gaa aaa cag tat gaa 



912 



960 



cag get ctt acc gac aaa ttc ctg agt tta ccg tea gaa etc aga aaa 864 

Gin Ala Leu Thr Asp Lys Phe Leu Ser Leu Pro Ser Glu Leu Arg Lys 

10 275 280 285 

tea ctg aca tgg gac aga gga atg gaa ctg gcc aga cat eta gaa ttt 

Ser Leu Thr Trp Asp Arg Gly Met Glu Leu Ala Arg His Leu Glu Phe 

290 295 300 

act gtc age acc ggc gtt aaa gtt tac ttc tgc gat cct cag agt cct 

Thr Val Ser Thr Gly Val Lys Val Tyr Phe Cys Asp Pro Gin Ser Pro 

305 310 315 320 

tgg cag egg gga aca aat gag aac aca aat ggg eta att egg cag tac 1008 

20 Trp Gin Arg Gly Thr Asn Glu Asn Thr Asn Gly Leu He Arg Gin Tyr 

325 330 335 

ttt cct aaa aag aca tgt ctt gee caa tat act caa cat gaa eta gat 1056 

Phe Pro Lys Lys Thr Cys Leu Ala Gin Tyr Thr Gin His Glu Leu Asp 
340 345 350 

25 

ctg gtt get get cag eta aac aac aga ccg aga aag aca ctg aag ttc 1104 

Leu Val Ala Ala Gin Leu Asn Asn Arg Pro Arg Lys Thr Leu Lys Phe 

355 360 365 

aaa aca ccg aaa gag ata att gaa agg ggt gtt gca ttg aca gat gaa 1152 

Lys Thr Pro Lys Glu lie He Glu Arg Gly Val Ala Leu Thr Asp Glu 

370 375 380 

ttc agg tct aga tta gat aaa agt aaa gtg att aac age gca tta gag 1200 

Phe Arg Ser Arg Leu Asp Lys Ser Lys Val He Asn Ser Ala Leu Glu 

35 3 8 5 ' 390 395 400 



124B 



40 cag aag ctt ggt gta gag cag cct aca ctg tat tgg eat gta aaa aat 1296 

Gin Lys Leu Gly Val Glu Gin Pro Thr Leu Tyr Trp His Val Lys Asn 
420 425 430 



aag egg get ttg etc gac gcc tta gcc att gag atg tta gat agg cac 1314 

Lys Arg Ala Leu Leu Asp Ala Leu Ala Tie Glu Met Leu Asp Arg His 
435 440 445 

cat act cac ttt tgc cct tta aaa ggg gaa age tgg caa gat ttt tta 1392 

His Thr His Phe Cys Pro Leu Lys Gly Glu Ser Trp Gin Asp Phe Leu 
450 455 460 

cgc aat aac get aaa agt ttt aga tgt get tta eta agt cat cgc aat 14 4 0 

Arg Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asn 

465 470 475 480 



1488 
Gly Ala Lys Val His Ser Asp Thr Arg Pro Thr Glu Lys Gln Tyr Glu
485 490 495 

act etc gaa aat caa tta gec ttt tta tgc caa caa ggt ttt tea eta 1536 

Thr Leu Glu Asn Gin Leu Ala Phe Leu Cys Gin Gin Gly Phe Ser Leu 

500 505 510 

gag aac gcg tta tat gca etc age get gtg ggg cat ttt act tta ggt 1584 

Glu Asn Ala Leu Tyx Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly 

515 ' 520 52*5 

tgc gta ttg gaa gat caa gag cat caa gtc get aaa gaa gaa agg gaa 1632 

Cys Val Leu Glu Asp Gin Glu His Gin Val Ala Lys Glu Glu Arg Glu 

530 535 540 

aca cct act act gat agt atg ccg cca tta tta cga caa get ate gaa 16B0 

Thr Pro Thr Thr Asp Ser Wet Pro Pro Leu Leu Arg Gin Ala lie Glu 

545 550 555 560 

tta ttt gat cac caa ggt gca gag cca gec ttc tta ttc ggc ctt gaa 1728 

Leu Phe Asp His Gin Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu 
20 5 6 5 5 7 0 5 7 5 

ttg ate ata tgc gga tta gaa aaa caa ctt aaa tgt gaa agt ggg tct 1776 

Leu He lie Cys Gly Leu Glu Lys Gin Leu Lys Cys Glu Ser Gly Ser 

580 585 590 



act agt taa 1785
Thr Ser


<210> 6

<211> 594

<212> PRT

<213> Artificial sequence

<220>

<223> IS-tet fusion protein

<400> 6

Met Arg Arg Thr Phe Thr Ala Glu Glu Lys Ala Ser Val Phe Glu Leu
1 5 10 15

Trp Lys Asn Gly Thr Gly Phe Ser Glu Ile Ala Asn Ile Leu Gly Ser
20 25 30

Lys Pro Gly Thr Ile Phe Thr Met Leu Arg Asp Thr Gly Gly Ile Lys
35 40 45

Pro His Glu Arg Lys Arg Ala Val Ala His Leu Thr Leu Ser Glu Arg
50 55 60

Glu Glu Ile Arg Ala Gly Leu Ser Ala Lys Met Ser Ile Arg Ala Ile
65 70 75 80

Ala Thr Ala Leu Asn Arg Ser Pro Ser Thr Ile Ser Arg Glu Val Gln
85 90 95

Arg Asn Arg Gly Arg Arg Tyr Tyr Lys Ala Val Asp Ala Asn Asn Arg
100 105 110

Ala Asn Arg Met Ala Lys Arg Pro Lys Pro Cys Leu Leu Asp Gln Asn
115 120 125

Leu Pro Leu Arg Lys Leu Val Leu Glu Lys Leu Glu Met Lys Trp Ser
130 135 140

Pro Glu Gln Ile Ser Gly Trp Leu Arg Arg Thr Lys Pro Arg Gln Lys
145 150 155 160

Thr Leu Arg Ile Ser Pro Glu Thr Ile Tyr Lys Thr Leu Tyr Phe Arg
165 170 175

Ser Arg Glu Ala Leu His His Leu Asn Ile Gln His Leu Arg Arg Ser
180 185 190

His Ser Leu Arg His Gly Arg Arg His Thr Arg Lys Gly Glu Arg Gly
195 200 205

Thr Ile Asn Ile Val Asn Gly Thr Pro Ile His Glu Arg Ser Arg Asn
210 215 220

Ile Asp Asn Arg Arg Ser Leu Gly His Trp Glu Gly Asp Leu Val Ser
225 230 235 240

Gly Thr Lys Asn Ser His Ile Ala Thr Leu Val Asp Arg Lys Ser Arg
245 250 255

Tyr Thr Ile Ile Leu Arg Leu Arg Gly Lys Asp Ser Val Ser Val Asn
260 265 270

Gln Ala Leu Thr Asp Lys Phe Leu Ser Leu Pro Ser Glu Leu Arg Lys
275 280 285

Ser Leu Thr Trp Asp Arg Gly Met Glu Leu Ala Arg His Leu Glu Phe
290 295 300

Thr Val Ser Thr Gly Val Lys Val Tyr Phe Cys Asp Pro Gln Ser Pro
305 310 315 320

Trp Gln Arg Gly Thr Asn Glu Asn Thr Asn Gly Leu Ile Arg Gln Tyr
325 330 335

Phe Pro Lys Lys Thr Cys Leu Ala Gln Tyr Thr Gln His Glu Leu Asp
340 345 350

Leu Val Ala Ala Gln Leu Asn Asn Arg Pro Arg Lys Thr Leu Lys Phe
355 360 365

Lys Thr Pro Lys Glu Ile Ile Glu Arg Gly Val Ala Leu Thr Asp Glu
370 375 380

Phe Arg Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu
385 390 395 400

Leu Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala
405 410 415

Gln Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn
420 425 430

Lys Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His
435 440 445

His Thr His Phe Cys Pro Leu Lys Gly Glu Ser Trp Gln Asp Phe Leu
450 455 460

Arg Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asn
465 470 475 480

Gly Ala Lys Val His Ser Asp Thr Arg Pro Thr Glu Lys Gln Tyr Glu
485 490 495

Thr Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu
500 505 510

Glu Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly
515 520 525

Cys Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu
530 535 540

Thr Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu
545 550 555 560

Leu Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu
565 570 575

Leu Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser
580 585 590

Thr Ser
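
The header fields of the listing can be cross-checked arithmetically: the number of codons spanned by each <222> CDS range should be slightly larger than the residue count given in <211> of the paired protein entry, the difference being the stop codon (and, for SEQ ID NO 1, one further codon that the listing shows after the stop). The sketch below only performs that arithmetic; the function name is hypothetical and the figures are those read from the headers above.

```python
# Minimal sketch: compare CDS spans (<222>) with protein lengths (<211>).
# Coordinates are 1-based and inclusive, as in the listing headers.


def codon_count(cds_start: int, cds_end: int) -> int:
    """Return the number of complete codons in an inclusive CDS span."""
    length = cds_end - cds_start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("CDS span is empty or not a multiple of three")
    return length // 3


# (CDS start, CDS end, residues in the paired protein entry)
LISTING = {
    "SEQ ID NO 1 / 2": (63, 1934, 622),
    "SEQ ID NO 3 / 4": (1, 1650, 549),
    "SEQ ID NO 5 / 6": (1, 1785, 594),
}

if __name__ == "__main__":
    for name, (start, end, residues) in LISTING.items():
        codons = codon_count(start, end)
        print(f"{name}: {codons} codons, {codons - residues} beyond the protein")
```

A span that is not a multiple of three raises an error, which is a quick way to catch a mistyped coordinate in a header.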




Claims 

1. A fusion protein comprising at least: (i) a site-directed recombinase protein, or a fragment or a variant thereof and,
(ii) a heterologous protein, or a fragment or a variant thereof, that binds either directly or indirectly DNA at least at
one responsive element (RE), said fusion protein having a heterologous site-directed recombinase activity such
that in a cell or cell free system containing said responsive element, said fusion protein binds directly or indirectly
to said responsive element and catalyzes recombination in the vicinity or at said DNA responsive element in presence
of site-directed recombinase targeting sequence (SDRTS).

2. Fusion protein according to claim 1 wherein the endogenous DNA-binding domain of said site-directed recombinase
protein or a fragment or a variant thereof, is not functional such that said fusion protein does not catalyze
recombination at the natural endogenous targeting sequence of said site-directed recombinase protein.

3. Fusion protein according to claims 1 and 2 wherein said site-directed recombinase is selected among prokaryotic
site-directed recombinases and eukaryotic site-directed recombinases.

4. Fusion protein according to claim 3 wherein said site-directed recombinases are selected in the group comprising 
transposases and DNA recombinases. 

5. Fusion protein according to claim 4 wherein said transposase is selected from the transposases encoded by sequences
derived from a transposable element selected from the group consisting of:

- transposons Tn3, Tn5, Tn7, Tn10, Tn916; and

- procaryotic mobile elements IS1, IS2, IS3, IS5, IS10L, IS10L/R-2, IS26, IS30, IS50, IS150, IS186, IS911, and

- eukaryotic mobile elements Tc1 of Caenorhabditis elegans, the P and Copia elements of Drosophila, the synthetic
element Sleeping Beauty, Hobo, Mariner, Ac-Ds Spm, the Mu elements of maize, Ty1, Ty2, Ty3 elements
of yeast, Tam1, Tam2 and Tam3 elements of Antirrhinum, Tx1 and Tx2 elements of Xenopus; and
retrotransposons such as blood, gypsy, springer and beagle elements of Drosophila; and

- integrases of retroviruses RSV, SNV, Mo-MLV, MMTV, HTLV1 and HIV1;

and their fragments or variants thereof.

6. Fusion protein according to claim 5 wherein said transposase is selected from the transposases able to form
covalently joined SDRTS sequences such as sequences derived from procaryotic mobile elements IS1, IS2, IS3,
IS5, IS10L, IS10L/R-2, IS26, IS30, IS50, IS150, IS186, IS911.

7. Fusion protein according to claim 6 wherein said transposase is IS30 transposase from Escherichia coli.

8. Fusion protein according to claim 5 wherein said transposase is selected from the transposases able to form
covalently joined SDRTS sequences such as eukaryotic mobile elements Tc1 of Caenorhabditis elegans, the P
and Copia elements of Drosophila, Ac-Ds Spm and Mu elements of maize, Ty1, Ty2, Ty3 elements of yeast, Tam1,
Tam2 and Tam3 elements of Antirrhinum, Tx1 and Tx2 elements of Xenopus.
9. Fusion protein according to claim 4 wherein said DNA recombinase is selected from the group of site-directed
recombinases comprising the Cre recombinase of bacteriophage P1, the FLP recombinase of Saccharomyces
cerevisiae, the R recombinase of Zygosaccharomyces rouxii pSR1, the A recombinase of Kluyveromyces drosophilarum
pKD1, the A recombinase of Kluyveromyces waltii pKW1, the integrase λ Int, the recombinase of the
GIN recombination system of the Mu bacteriophage, the recombinase of the CIN recombination system of P1
bacteriophage, the recombinase of the MIN recombination system of P1-15 bacteriophage, the bacterial β recombinase,
the invertase of the Hin system of Salmonella, the invertase of the FIM system of Escherichia, the integrase
of the SLP1 system of Streptomyces or variants thereof.

10. Fusion protein according to claims 1 to 9 wherein said heterologous protein, or a fragment or a variant thereof
comprises at least a DNA binding domain, said heterologous protein being selected among site-specific recombinases,
transposases, procaryotic host factors, procaryotic activator proteins, procaryotic repressor proteins,
eukaryotic transcription factors, DNA methylases, restriction endonucleases, chromatin proteins, or a fragment or a
variant thereof.

11. Fusion protein according to claim 10 wherein said heterologous protein is the procaryotic repressor protein cI of
the lambda phage, and the variants thereof.

12. Fusion protein according to claim 10 wherein said heterologous protein is the eukaryotic zinc-finger transcription
factor Gli, and the variants thereof.

13. Fusion protein according to claim 10 wherein said heterologous protein is the Tet repressor and its variants
encoded by the tetracycline resistance gene.

14. Fusion protein according to claims 1 to 9 wherein said heterologous protein, or a fragment or a variant thereof that
binds DNA directly or indirectly, is selected among proteins that are known to associate with DNA binding factors.

15. Fusion protein according to claims 1 to 14 wherein said DNA responsive element is selected among operator
region of procaryotic repressors, methylation sites of sequence-specific methylases, recognition sites of restriction
endonucleases, binding sites of host factors, recognition sequence of site-specific recombinases, hot spot sequences
for transposases, transcription factors responsive elements, the phages operator region, the CpG island,
LHS, GOHS, IS30 recognition sequence, (IS30)2.

16. Fusion protein according to claim 15 wherein said DNA responsive element is the operator region of phage λ cI
repressor.

17. Fusion protein according to claims 1 and 2 wherein said heterologous protein is the cI repressor of the lambda
phage or a fragment or a variant thereof and said transposase is IS30 transposase from Escherichia coli or a
fragment or a variant thereof.

18. Fusion protein according to claim 17 of sequence SEQ ID N° 2. 

19. Fusion protein according to claims 1 and 2 wherein said heterologous protein is the DNA-binding domain of Gli
transcription factor or a fragment or a variant thereof and said transposase is IS30 transposase from Escherichia
coli or a fragment or a variant thereof.

20. Fusion protein according to claim 19 of sequence SEQ ID N° 4.

21. Fusion protein according to claims 1 and 2 wherein said heterologous protein is Tet repressor or a fragment or a
variant thereof and said transposase is IS30 transposase from Escherichia coli or a fragment or a variant thereof.

22. Fusion protein according to claim 21 of sequence SEQ ID N° 6. 

23. Recombinant polynucleotide encoding for a protein fusion according to claims 1 to 22. 

24. Recombinant polynucleotide of sequence SEQ ID N° 1 encoding for the fusion protein according to claim 18. 

25. Recombinant polynucleotide of sequence SEQ ID N° 3 encoding for the fusion protein according to claim 20.

26. Recombinant polynucleotide of sequence SEQ ID N° 5 encoding for the fusion protein according to claim 22.

27. Recombinant polynucleotide comprising a polynucleotide selected among : 

a) The polynucleotide according to claims 23 to 26 ; 

b) The polynucleotides presenting at least 70% identity after optimal alignment with the polynucleotide of step a);

c) Fragments of polynucleotides of claims 23 to 26 encoding for a peptide having at least a site-directed 
recombinase activity; 

d) The complementary sequence or RNA sequence corresponding to a polynucleotide of step a), b) or c). 

28. A DNA cassette comprising a promoter operably linked to a polynucleotide according to claims 23 to 27. 

29. Vector comprising a polynucleotide according to claims 23 to 27. 

30. Expression vector comprising at least one DNA cassette according to claim 28. 

31. Gene targeting vector comprising at least: 

a) A gene encoding a fusion protein according to claims 1 to 22 or a polynucleotide according to claims 23 to 27 ; 

b) Optionally, a promoter that is operatively linked to said fusion protein gene or to said polynucleotide;

c) Optionally, a DNA responsive element recognized by the DNA binding domain of said fusion protein ; 

d) At least, two SDRTS recognized by the recombinase domain of said fusion protein ; 

e) At least, one transposable DNA sequence of interest, wherein said DNA sequence of interest is located 
between said two SDRTS sequences ; 

f) Optionally, at least one marker gene; 

and wherein one of said SDRTS is located between said fusion protein encoding gene or said polynucleotide and 
said DNA sequence of interest. 

32. Gene targeting vector comprising at least: 

a) A gene encoding a fusion protein according to claims 1 to 22 or a polynucleotide according to claims 23 to 27 ;

b) Optionally a promoter that is operatively linked to said fusion protein gene or to said polynucleotide; 

c) Optionally, one DNA responsive element recognized by the DNA binding domain of said fusion protein ; 

d) At least two SDRTS recognized by the recombinase domain of said fusion protein, said SDRTS being joined
covalently together and being separated at most by 20 bp;

e) Optionally, at least one marker gene. 

33. Gene targeting vector comprising: 

a) Optionally, a DNA responsive element recognized by the DNA binding domain of a fusion protein according 
to claims 1 to 22 ; 

b) At least, two SDRTS recognized by the recombinase domain of said fusion protein according to claims 1 
to 22 ; 

c) At least, one transposable DNA sequence of interest, wherein said gene is located between said two SDRTS 
sequences ; 

d) Optionally, at least one marker gene. 

34. Vector according to claims 30 to 33 wherein the said promoter is inducible by an inducing stimulus to cause the 
transcription of said fusion protein encoding gene or said polynucleotide. 

35. A vector according to claims 29 to 34 whereby said vector is derived from a viral vector, preferably an adenoviral, 
retroviral, or adeno-associated viral vector. 

36. A kit to perform site-directed recombination wherein said kit comprises at least : 

(i) a gene encoding a fusion protein according to claims 1 to 22 or a polynucleotide according to claims 23 to
27, optionally a promoter that is operatively linked to said fusion protein gene or to said polynucleotide, or a
DNA cassette according to claim 28, or a vector according to one of claims 29 or 30 ; and

(ii) a donor DNA molecule that comprises a transposable DNA sequence of interest, said DNA sequence of
interest being flanked at its 5' and/or 3' ends by said SDRTS, said SDRTS being specifically recognized by
the recombinase domain of the fusion protein encoded by a gene, a polynucleotide, a DNA cassette, a vector
of step (i) ;

(iii) Optionally, a recipient DNA molecule that comprises at least one DNA responsive element that binds the
DNA-binding domain of the fusion protein encoded by a gene, a polynucleotide, a DNA cassette or a vector
of (i).

37. A kit to perform site-directed recombination wherein said kit comprises at least : 
(i) a fusion protein according to claims 1 to 22 ; and

(ii) a donor DNA molecule that comprises a transposable DNA sequence of interest, said DNA sequence of
interest being flanked at its 5' and/or 3' ends by said SDRTS, said SDRTS being specifically recognized by
the recombinase domain of the fusion protein of (i) ;

(iii) optionally, a recipient DNA molecule that comprises at least one DNA responsive element that binds the
DNA-binding domain of the fusion protein of (i).

38. A kit to perform site-directed recombination wherein said kit comprises at least :

(i) a gene targeting vector according to claims 31 to 33 ; and 

(ii) optionally, a recipient DNA molecule that comprises at least one DNA responsive element that binds the 
DNA-binding domain of the fusion protein of (i). 

39. A host cell transformed by at least one polynucleotide according to claims 23 to 27, one DNA cassette according 
to claim 28, one vector according to claims 29 to 35. 

40. Cell according to claim 39 wherein the protein fusion encoded by said polynucleotide according to claims 23 to
27, DNA cassette according to claim 28, or vector according to claims 29 to 35, is expressed and biologically active
in said cell.

41. Cell according to one of claims 39 or 40 wherein said cell is an eukaryotic cell selected among human cells, murine
cells, yeast cells, amphibian cells, fish cells, drosophila cells, Caenorhabditis cells, plant cells.

42. Cell according to one of claims 39 or 40 wherein said cell is a procaryotic cell selected among Escherichia sp.,
Bacillus sp., Campylobacter sp., Helicobacter sp., Agrobacterium sp., Staphylococcus sp., Thermophilus sp.,
Azorhizobium sp., Rhizobium sp., Neisseria sp., Pseudomonas sp., Mycobacterium sp., Streptomyces sp.,
Corynebacterium sp., Lactobacillus sp., Micrococcus sp., Yersinia sp., Brucella sp., Bordetella sp., Proteus sp.,
Klebsiella sp., Erwinia sp., Vibrio sp., Photorhabdus sp., Desulfovibrio sp., Listeria sp., Clostridium sp.,
Actinomyces sp., Haemophilus sp.

43. Animal, except humans, comprising at least a cell according to claims 39 to 41.

44. Animal according to claim 43 wherein said animal is a mouse.

45. A method for in vitro site-directed DNA recombination, said method comprising the steps of combining : 

a) a donor DNA molecule that comprises a transposable DNA sequence of interest, the DNA sequence of
interest being flanked at its 5' and/or 3' ends by said SDRTS being specifically recognized by the recombinase
domain of the fusion protein according to claims 1 to 22 ; with

b) a recipient DNA molecule that comprises at least one DNA responsive element that binds the DNA-binding 
domain of the fusion protein according to claims 1 to 22 ; with 

c) a fusion protein according to claims 1 to 22. 

46. A method for in vivo site-directed DNA recombination, said method comprising the steps of combining into a cell: 

a) A donor DNA molecule that comprises a transposable DNA sequence of interest, the DNA sequence of
interest being flanked at its 5' and/or 3' ends by SDRTS being specifically recognized by the recombinase
domain of the fusion protein according to claims 1 to 22 ; with

b) A recipient DNA molecule that comprises at least one DNA responsive element that binds the DNA-binding
domain of the fusion protein according to claims 1 to 22 ; with

c) A fusion protein according to claims 1 to 22 or a gene encoding a fusion protein according to claims 1 to
22 or a polynucleotide according to claims 23 to 27, optionally a promoter that is operatively linked to said
fusion protein gene or to said polynucleotide, or a DNA cassette according to claim 28, or a vector according
to claims 29 and 30, said gene or polynucleotide being efficiently expressed in said cell.

47. A method according to one of claims 45 or 46 wherein the DNA recombination is selected among insertion, deletion,
inversion, translocation, fusion. 

48. A method for the stable introduction of a DNA sequence of interest into at least one recipient DNA molecule of a
cell, said recipient DNA molecule comprises at least one DNA responsive element that binds the DNA-binding
domain of the fusion protein according to claims 1 to 22, and said method comprising :

a) providing a donor DNA molecule that comprises a transposable DNA sequence of interest, the DNA sequence
of interest being flanked at its 5' and/or 3' ends by said SDRTS being specifically recognized by the
recombinase domain of the fusion protein according to claims 1 to 22 ;

b) introducing into the cell said donor DNA molecule; and,

c) previously, simultaneously, or separately, introducing into said cell a fusion protein according to claims 1 to
21 or a gene encoding a fusion protein according to claims 1 to 22 or a polynucleotide according to claims 23
to 27, optionally a promoter that is operatively linked to said fusion protein gene or to said polynucleotide, or
a DNA cassette according to claim 28, or a vector according to one of claims 29 or 30, said gene or
polynucleotide being efficiently expressed in said cell.

49. Use of a vector according to claims 31 to 33 as a functional transposon in integrating a nucleic acid sequence of
interest into a genome of a cell.

50. Vector according to claims 29 to 35 as a medicament.

51. Use of a vector according to claim 50 to perform a gene therapy to a patient in need of such treatment.

52. Use of a fusion protein according to claims 1 to 22, or a polynucleotide according to claims 23 to 27, or a vector
according to claims 29 to 35, or a DNA cassette according to claim 28, or a kit according to claims 36 and 38, for
producing transgenic cells and/or transgenic animals.

53. Use of a fusion protein according to claims 1 to 22, or a polynucleotide according to claims 23 to 27, or a vector
according to claims 29 to 35, or a DNA cassette according to claim 28, or a kit according to claims 36 and 38 to
perform gene targeting, gene knock-out (KO), gene knock-in (KI), gene-trapping, transposon tagging.